I. INTRODUCTION


Much has been written in the past several years
about techniques for translating a program written in a
serial language (such as FORTRAN) into an efficient
program for a parallel machine. This paper discusses the
use of these concepts in the implementation of a FORTRAN
compiler for an idealized target machine. The paper is
organized into three sections. The first gives an intuitive
feel for the various concepts and transformations, the
second discusses the applicability, advantages, and
disadvantages of this type of compilation, and the last
outlines future developments. A description of the
compiler, its modules, and data structures is given in
the appendices.


II. AN INTUITIVE LOOK

When executing on a parallel machine, it is often
desirable to use as much of the machine as possible at any
given time. When compiling a program written in a serial
language for a parallel machine, two options are
available for achieving this goal:

1)  Recognition  of  statements  that  can  be  done 
simultaneously  for  a  large  number  of  operands  and 
results. 

2)  Discovering  a  partial  execution  ordering  between 
statements. 

The  first  gives  a  lot  of  the  same  thing  to  do  at  any 
given  time,  and  the  second  gives  a  lot  of  possibly 
different  things  to  do.   The  usefulness  of  each  depends 
largely  on  the  type  of  target  machine. 

Unfortunately,  any  algorithm  for  recognizing,  from 
FORTRAN  source  text,  operations  that  can  be  executed 
simultaneously  for  a  large  number  of  operands  and  results 
must  consider  the  partial  ordering  implied  by  the 
programmer  through  statement  orderings  and  subscripts. 
The task of recognizing these simultaneous aspects of a
program is divided into four distinct subproblems, the
first of which is by far the most difficult:

1) Recognition of the partial ordering between
statements of the serial program, including
orderings between statements on different iterations
of DO loops.

2) Recognition of cycles in the partial ordering.

3) Use of the partial ordering and the cycles to
recognize array operations.

4) Transformation of all cycles, which are recurrences,
into some form efficiently processable on the target
machine.


The algorithms used in this compiler were described
in rather complex formulations in the original works [C1,
D1, M, S1, T1]. Fortunately, the implementation of the
theory is relatively straightforward. For the details of
each algorithm, see the references included with each
section.


REQUIREMENT 

In any compilation transformation, there is one
basic requirement that must be met: the results of the
program before the transformation must be the same as the
results of the program after the transformation. Some
transformations exist which decrease or increase the
roundoff error associated with a particular type of
calculation. All of the transformations in this paper,
excluding the recurrence transformations, are result
preserving [M1]. The recurrence transformations disturb
the roundoff error associated with the calculation. For
further details of the errors associated with recurrence
calculations, see [C1].

PARTIAL ORDERING


In  a  program  written  for  a  serial  machine,  the  order 
of  execution  of  the  statements  (ignoring  transfers  of 
control)  is  in  the  linear  order  specified  by  the 
programmer.   It  is  quite  obvious  that  most  programs  would 
generate  the  same  results  if  the  statements  of  the 
program  were  carefully  permuted. 


     1   A=1               2   B=37
     2   B=37              1   A=1
     3   C=A*B             5   D=A+81
     4   B=11              3   C=A*B
     5   D=A+81            4   B=11
     6   E=A+B             6   E=A+B

Both of the programs above generate the same values for
the variables A, B, C, D, and E. Since the ordering
specified is not unique, a partial ordering seems to be
all that is required. A valid partial ordering is then
one that always yields values that are the same as the
original ordering. Reading < as "must precede", consider
the valid partial orderings given here:

     1<2<3<4<5<6

     2<1<5<3<4<6

     1<3<4<6,  1<5,  2<3<4<6

A chain in a partial ordering is a group of statements
for which a complete linear ordering is specified. The
minimal valid partial ordering is the valid partial
ordering whose longest chain is shortest.
The recognition of the minimal valid partial ordering is
equivalent to finding the minimum number of steps in
which a program could be computed, given an arbitrary
number of computing elements and an ideal data switch.
Also, even if only one statement, or a portion of one,
can be executed at a time on a given machine, a partial
ordering allows the compiler to choose which statement to
generate next.
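
The link between the longest chain and the minimum number of steps can be illustrated with a small sketch. It is written in Python purely for illustration; the function name `minimum_steps` and the pair-list representation of the < relation are assumptions made here, not part of the compiler described in this paper, and the ordering is assumed to contain no cycles.

```python
from collections import defaultdict

def minimum_steps(statements, precedes):
    """Map each statement to the earliest step at which it may run,
    assuming unlimited processors and an acyclic ordering."""
    preds = defaultdict(set)
    for before, after in precedes:
        preds[after].add(before)
    step = {}

    def visit(s):
        # A statement runs one step after its latest predecessor.
        if s not in step:
            step[s] = 1 + max((visit(p) for p in preds[s]), default=0)
        return step[s]

    for s in statements:
        visit(s)
    return step

# Third ordering from the text: 1<3<4<6, 1<5, 2<3<4<6
relations = [(1, 3), (3, 4), (4, 6), (1, 5), (2, 3)]
steps = minimum_steps(range(1, 7), relations)
longest_chain = max(steps.values())
```

Under these assumptions statements 1 and 2 share the first step, 3 and 5 the second, and the longest chain (1<3<4<6) fixes the total at four steps.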


PARTIAL ORDERINGS and ITERATION BOUNDARIES

Consider  the  following  program: 

DO 20 I = 1, 10
10    B(I) = I
20    A(I) = B(I+1)

A  simultaneous  version  of  each  assignment  statement  in 
this  program  can  easily  be  constructed. 

DO  10  I  =  1,  10 
10   B(I)  =  I 

DO  20  I  =  1,  10 
20   A(I)  =  B(I+1) 

The problem comes in determining whether there exists a
valid partial ordering of the two statements in the
simultaneous program. In the serial version, each
element of A preserves an old value of B and the elements
of B are assigned consecutive integers. If the serial
ordering is preserved in the simultaneous program, B has
the same value as in the serial program, but A also
receives consecutive integers. If the opposite ordering is
assumed, then the values generated in the simultaneous
program are the same as the values generated in the
serial program. If no ordering is assumed, one would have
to be selected on a 'small' machine, and if the wrong
choice were made, the answers would not agree. The
reason for the valid partial ordering being the inverse
of the serial ordering can be easily seen if the DO loop
is written out, rather than being expressed iteratively.

10  B(1) = 1       iteration 1
20  A(1) = B(2)

10  B(2) = 2       iteration 2
20  A(2) = B(3)

Statement 20 in iteration 1 uses a value that is modified
by statement 10 in iteration 2. Hence, statement 20
(simultaneous version) must be executed prior to
statement 10 (simultaneous version). It should be noted
that the existence of partial orderings across iteration
boundaries does not always require the inversion of the
ordering in the serial program. In the following loop,
the serial ordering is also the valid one.


DO 40 I = 1, 10
30    A(I) = B(I+1)
40    B(I) = I
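
The first example of this section (statements 10 and 20) can be checked by direct simulation. The sketch below is in Python and is only an illustration: the function names, the use of slice assignment to stand for a "simultaneous" array statement, and the choice of negative numbers as the old contents of B are all assumptions made here.

```python
def initial_arrays(n):
    # Index 0 is unused; B holds n + 1 recognizable "old" values.
    a = [0] * (n + 2)
    b = [-i for i in range(n + 2)]
    return a, b

def serial(n=10):
    a, b = initial_arrays(n)
    for i in range(1, n + 1):
        b[i] = i            # statement 10
        a[i] = b[i + 1]     # statement 20
    return a, b

def simultaneous(order, n=10):
    a, b = initial_arrays(n)
    for statement in order:
        if statement == 10:
            b[1:n + 1] = range(1, n + 1)   # B(I) = I for all I at once
        else:
            a[1:n + 1] = b[2:n + 2]        # A(I) = B(I+1) for all I at once
    return a, b

inverted_matches = simultaneous((20, 10)) == serial()
serial_order_matches = simultaneous((10, 20)) == serial()
```

With these assumptions only the inverted ordering (20 before 10) reproduces the serial results, which agrees with the argument given above.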


CYCLES  and  PARTIAL  ORDERING 


A  cycle  in  a  partial  ordering  is  a  collection  of 
statements  which  are  related  by  a  cycle  in  the  directed 
graph  of  the  precedes  relation. 

4<3<2<5<3<1     (statements  3,  2,  and  5) 

2<2  (statement  2) 

A cycle may occur only when some form of iteration (IF
loops, DO loops, etc.) encloses the statements involved
in the cycle, and each statement within the scope of the
iteration is considered to be executed simultaneously for
all iterations.


DO 20 I = 2, 10
10    B(I) = A(I-1)
20    A(I) = B(I-1)

10  B(2) = A(1)      iteration 1
20  A(2) = B(1)

10  B(3) = A(2)      iteration 2
20  A(3) = B(2)

Statement 10 in iteration 1 must precede statement 20 in
iteration 2. Similarly, statement 20 in iteration 1 must
precede statement 10 in iteration 2. This results in
10<20 and 20<10, which is a cycle. Cycles may contain
any number of statements.

DO 10 I = 2, 10
10    D(I) = D(I-1)

D(2) = D(1)      iteration 1
D(3) = D(2)      iteration 2

Iteration one must be executed before iteration two.
This statement is a cycle, because it must precede
itself.

From these examples, it is evident that
cycles in a partial ordering imply that fetching all of
the operands, performing all of the operations, and
storing all of the results (in that order) will not
generate the same values as the serial program for those
statements contained in the cycle if the entire index set
is done simultaneously. The statements involved in a
cycle in a partial ordering constitute a recurrence.
Extensive work has been done on fast, accurate, and
efficient methods of calculating the results of
recurrences [C1, S1]. A few of the simpler methods will
be presented in the code generation section of this
paper.
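
A minimal sketch of why a recurrence cannot simply be executed "fetch all, operate, store all" is given here for the loop D(I) = D(I-1). It is in Python, with zero-based lists standing for the FORTRAN array; the function names are illustrative assumptions.

```python
def serial_shift(d):
    # D(I) = D(I-1), one iteration at a time.
    d = list(d)
    for i in range(1, len(d)):
        d[i] = d[i - 1]
    return d

def simultaneous_shift(d):
    # Fetch every operand first, then store every result.
    fetched = d[:-1]
    return [d[0]] + fetched

values = [1, 2, 3, 4]
serial_result = serial_shift(values)
simultaneous_result = simultaneous_shift(values)
```

The serial loop propagates the first element through the whole array, while the simultaneous version only shifts the old values by one place, so the two disagree.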


STANDARDIZATION  TRANSFORMATIONS 


In order to make the calculation of the partial
ordering as simple as possible, all of the index sets
discovered by the compiler (DO indices, induction
variables, etc.) are normalized to a starting value of
zero and an increment of one. This necessitates the
substitution of an expression using the new index
variable for the old index variable for the entire scope
of the index assigned. Also, if the old index variable
is used outside of the scope of the index set, then the
correct value of the old index variable must be set upon
exiting from the scope of the index set.
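
The normalization can be sketched as a mapping from the new zero-based index back to the old one. The Python fragment below is an illustration only: the name `normalize` is hypothetical, and it assumes a positive increment and a starting value no greater than the limit.

```python
def normalize(start, stop, step):
    """For DO I = start, stop, step return (trip_count, old_index),
    where old_index(k) gives the old index for new index k = 0, 1, ..."""
    trip_count = (stop - start) // step + 1

    def old_index(new):
        # Expression substituted for the old index inside the loop.
        return start + step * new

    return trip_count, old_index

trip_count, old_index = normalize(2, 10, 2)
old_values = [old_index(k) for k in range(trip_count)]
```

For DO I = 2, 10, 2 the new index runs from 0 to 4 and every use of I in the loop body becomes 2 + 2*K.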


DISCOVERY TRANSFORMATIONS


In order to recognize a minimal valid partial
ordering, the identification of the ordered set of
elements involved in the operations is imperative. In
FORTRAN, this involves collection of information
concerning subscripts. Since subscripts are combinations
of scalar integer variables and integer constants in the
large majority of cases, the problem reduces to
discovering as much as possible about scalar integer
variables. Several items are easily recognizable:

1)  variables  defined  in  a  DATA  statement  and  not 
changed 

2)  variables  assigned  expressions  of  known  scalar 
integer  variables  and  constants 

3)  variables  representing  index  sets 

Of course, when information concerning the value of
variables is collected at compile time, the scope of the
value of the variable must be considered [A1].


ESTABLISHING  A  NEAR  MINIMAL  VALID  PARTIAL  ORDERING 

A pair of statements A and B must be related by
A < B if statement A is executed before statement B in
the serial program and if at least one of the following
is true:

1) Statement A modifies the value of a variable that
statement B uses to generate a result.

     C = D     statement A
     E = C     statement B

2) Statement B modifies the value of a variable that
statement A uses to generate a result.

     C = D     statement A
     D = F     statement B

3) Statement A and statement B both modify the value of
the same variable.

     C = D     statement A
     C = F     statement B

To construct a valid partial ordering, all pairs of
statements must be tested for satisfying these
conditions. If a statement is involved in an iteration,
then each occurrence of that statement must be tested
with each occurrence of every statement within the scope
of the iteration (including other occurrences of itself).
For a program that does a large amount of iteration, this
exhaustive method is extremely expensive in compilation
time. A near minimal solution that is inexpensive is
obviously desirable.

A near minimal solution can be achieved by
successive approximation. First, each statement is
tested against only those statements that can follow it
in execution on a serial machine. If a statement is
inside the scope of an iteration, then it is tested
against every statement inside the scope of the iteration
(including itself for tests 1 and 2). To make the tests
quick and simple, only variable names are used (not
subscript information). The complexity of the tests and
the number of tests are kept small by treating every
occurrence of a statement within an iteration as an
identical reference to the entire array.

statement

   1        READ(1,1) B
   2        A = B + C
   3        M = 21
   4        DO 40 I = 2, M
   5        D(I) = A
   6        E(I) = D(I)
   7        F(I) = F(I-1)
   8        G(I) = H(I-1)
   9        H(I) = G(I-1)
  10   40   CONTINUE
  11        WRITE(2,2) H, G, A

1<2<5<11    3<4<8<9<11   9<8
4<5<6    4<7   7<7   6<5
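
The first approximation can be sketched as a name-only comparison of the variables each statement modifies and uses. The Python below is an illustration, not the compiler's DEPEND module: the tuple layout `(number, writes, reads)`, the function name, and the decision to treat the DO statement as modifying I and lying outside the loop body are assumptions made for this sketch. It produces only the direct relations, not the chains built from them.

```python
def first_approximation(stmts, loop_body):
    """stmts: (number, writes, reads) tuples in serial order.
    loop_body: numbers of the statements inside the iteration."""
    def related(a, b):
        _, wa, ra = a
        _, wb, rb = b
        # Tests 1, 2 and 3 of the text, on variable names only.
        return bool(wa & rb or ra & wb or wa & wb)

    relations = set()
    for i, a in enumerate(stmts):
        for j, b in enumerate(stmts):
            both_in_loop = a[0] in loop_body and b[0] in loop_body
            if i == j:
                # A statement is tested against itself for tests 1 and 2.
                if both_in_loop and a[1] & a[2]:
                    relations.add((a[0], a[0]))
            elif (i < j or both_in_loop) and related(a, b):
                relations.add((a[0], b[0]))
    return relations

program = [
    (1, {"B"}, set()),
    (2, {"A"}, {"B", "C"}),
    (3, {"M"}, set()),
    (4, {"I"}, {"M"}),
    (5, {"D"}, {"A", "I"}),
    (6, {"E"}, {"D", "I"}),
    (7, {"F"}, {"F", "I"}),
    (8, {"G"}, {"H", "I"}),
    (9, {"H"}, {"G", "I"}),
    (11, set(), {"H", "G", "A"}),
]
relations = first_approximation(program, loop_body={5, 6, 7, 8, 9})
```

Among the pairs found are 1<2, 3<4, 4<8, 8<9, 9<8, 6<5 and 7<7, which appear in the orderings listed above.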


This approximation to the minimal valid partial
ordering is minimal where subscript information does not
exist, and is always conservative (i.e., always contains
the minimal valid partial ordering as a subset) in those
cases with subscripts. In fact, two statements will be
ordered by this approximation and not by the minimal
valid partial ordering only if one of the statements has
a result variable that is present in the other statement,
but the subscript expressions are such that the two
occurrences of that single variable form nonintersecting
sets. Many people in the past have considered this to
be a sufficiently close approximation to the minimal
valid partial ordering [B1]. Because of the work done by
[K2], a closer approximation to the minimal valid partial
ordering seems desirable.

The second approximation is to check a limited
amount of subscript information in those cases where the
variable names match and subscript information is
present. A given variable generally has as many
subscript expressions as it has dimensions. If the
subscript expression for dimension I of one occurrence of
the variable will never equal the subscript expression
for dimension I of the other occurrence of the variable,
then the two statements are not related by the < relation
(i.e., they are independent) with respect to the
interaction of these two occurrences of this variable.

A(I,3,K) = ...

... = A(I,4,K)

Since 3 is not equal to 4, the two statements are
independent with respect to these two occurrences of the
variable A. The situation can become much more complex.

A(I,J) = ...

... = A(I,K*J)

Depending upon the value of K, the two statements may or
may not be independent with respect to the variable A.
If a range of values can be determined by one of the
discovery transforms, then a determination can be made.
Otherwise, the < relation must be assumed between the
two statements. By checking the subscripts
independently, minimality is approached, but still not
achieved.

DO 10 I = 1, 20
A(7*I,I) = ...
... = A(I,3*I)
10   CONTINUE

A(7,1) = ...
... = A(1,3)

A(21,3) = ...
... = A(7,21)    independent
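
The loop above can be checked by brute-force enumeration, which is exactly what the compiler tries to avoid but which is easy to do for a loop of 20 iterations. The Python sketch below is illustrative only; the function names are assumptions, and the subscripts A(7*I,I) and A(I,3*I) are taken from the example.

```python
def per_dimension_overlap(n=20):
    # Compare each subscript position separately over I = 1..n.
    index = range(1, n + 1)
    first = {7 * i for i in index} & set(index)
    second = set(index) & {3 * i for i in index}
    return bool(first), bool(second)

def elements_overlap(n=20):
    # Compare whole elements: A(7*I, I) written, A(I, 3*I) read.
    index = range(1, n + 1)
    written = {(7 * i, i) for i in index}
    read = {(i, 3 * i) for i in index}
    return bool(written & read)

dimension_result = per_dimension_overlap()
element_result = elements_overlap()
```

Each subscript position taken alone can coincide (7 and 14 in the first, multiples of 3 in the second), yet no single array element is both written and read, so testing the subscripts independently would still order the two statements.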
Even checking only one subscript at a time, the general
solution is extremely difficult without enumeration. The
second approximation must be even more limited. Only
specific types of subscript expressions are considered.
A subscript that can be expressed as a linear combination
of index set variables (Xi) and known values (Ci) is
called a linear subscript.

C0 + C1*X1 + C2*X2 + ...

The detection of intersecting subscript expressions
becomes considerably simpler if only linear subscripts
are considered, and a large majority of subscripts appear
to fall into this category. If an arbitrary number of
variables associated with index sets is allowed in each
subscript, the problem is still very difficult to solve
without enumeration. For this reason, only cases
involving one index set per subscript are currently being
considered. Plans are made to consider two index sets
per subscript at some future time.

In short, the second approximation is applied only
if the first approximation has found an ordering to be
implied by a variable that occurs in both statements
with subscripts. Then each subscript pair (corresponding
positions from each occurrence) is checked to ensure
linearity and that at most one index set is used in the
subscript expression for some dimension. In those cases
where all of these conditions are met, a further test is
made. Otherwise, the < relation is assumed (a
conservative assumption). For the details of this test,
see the appendix section labeled DEPEND. Plans are being
made to experiment with other tests to obtain an
understanding of the complexity and effectiveness of
calculating a partial ordering.
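
The test in DEPEND is not reproduced in this section. As a stand-in, the Python sketch below shows one plausible check for a pair of linear subscripts that each use a single normalized index set: a greatest-common-divisor rejection followed by a bounded search. The function name, the parameter names, and the assumption that both index sets run from 0 to the same upper bound are choices made for this illustration and should not be read as the compiler's actual test.

```python
from math import gcd

def may_intersect(c0, c1, d0, d1, upper):
    """Can c0 + c1*x equal d0 + d1*y for integers 0 <= x, y <= upper?"""
    g = gcd(c1, d1)
    diff = d0 - c0
    if g == 0:
        return diff == 0            # both subscripts are constants
    if diff % g != 0:
        return False                # no integer solution at all
    # Integer solutions exist; look for one inside the index sets.
    for x in range(upper + 1):
        rhs = c0 + c1 * x - d0
        if d1 == 0:
            if rhs == 0:
                return True
        elif rhs % d1 == 0 and 0 <= rhs // d1 <= upper:
            return True
    return False

even_vs_odd = may_intersect(0, 2, 1, 2, 9)      # A(2*I) against A(2*I+1)
neighbours = may_intersect(0, 1, 1, 1, 9)       # A(I) against A(I+1)
far_apart = may_intersect(0, 1, 100, 1, 9)      # A(I) against A(I+100)
```

When the check returns False the two occurrences can be treated as independent in that dimension; when it returns True the < relation is kept, which is the conservative choice.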


REDUCTION  OF  A  PARTIAL  ORDERING 


The length of the chains in a partial ordering may
be reducible by the technique of forward substitution.
If statement A modifies a variable that statement B uses
and statement A is executed before statement B in the
serial program, then the partial ordering will show that
A < B. This relation can be removed from the partial
ordering by substituting the expression for the variable
that is modified by statement A into the occurrences of
that variable in statement B.

A = B(I) + C * D            before
F = A * D + H

A = B(I) + C * D            after
F = (B(I) + C * D) * D + H
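
For scalars, the substitution just shown amounts to a textual replacement. The Python sketch below illustrates it on the example; the function name is hypothetical, statements are handled as plain strings, and it is assumed that none of the variables in the substituted expression is redefined between the two statements.

```python
import re

def forward_substitute(name, expression, target):
    """Replace whole-word uses of `name` in `target` by (expression)."""
    pattern = r"\b%s\b" % re.escape(name)
    return re.sub(pattern, lambda match: "(" + expression + ")", target)

# A = B(I) + C * D  substituted into  F = A * D + H
right_hand_side = forward_substitute("A", "B(I) + C * D", "A * D + H")
```

The parentheses keep the operator precedence of the original statement intact.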

With subscripted variables, the problem becomes slightly
more complex, as the values of subscripts may have to be
adjusted.

A(I) = B + C(J-I)            before
F(I) = A(I) + A(I-3)

A(I) = B + C(J-I)            after
F(I) = B + C(J-I) + B + C(J-I+3)    ??

The substitution of subscripted variables inside the
scope of an index set raises special problems because of
boundary conditions (as seen in the previous example).
Because of the difficulties involved, the benefits must
be examined to see if the gains are worth the trouble.
Forward substitution reduces the number of < relations in
the partial ordering while increasing the computation
time for individual statements. If each individual
statement describes a computation that will saturate the
target machine, then forward substitution will result in
a program that is more parallel, but will execute slower.
Because of this, forward substitution should only be
considered when the statement into which the substitution
is being done will not saturate the target machine. For
the purposes of this compiler, any statement inside the
scope of an index set is assumed to be able to saturate
the target machine. This assumption is reasonable on
currently constructable machines (a very finite number of
processors). Because of this, forward substitution is
only done outside the scope of an index set.


IDENTIFICATION OF CYCLES IN THE PARTIAL ORDERING

Once a valid partial ordering is established as the
approximation to the minimal valid partial ordering, all
of the cycles in the partial ordering (recurrences) must
be found. This implies that all of the transitive
relations must be filled in.

A<B      B<C      ==>      A<C

Since the statements of a program can be viewed as
vertices of a directed graph, and the < relation as a
directed edge, the cycles can be found by one of the
standard methods of finding all of the paths of all
lengths on a directed graph [F1]. The cycles are then
identified by looking for maximal totally connected
subgraphs (PI blocks) of the directed graph [B1]. A new
directed graph can be constructed using the PI blocks as
vertices and the edges that are unique between PI blocks
as edges in the new graph. This graph represents the
partial ordering of the PI blocks. Each PI block
represents the smallest set of statements that can be
executed by itself for an entire index set. There are
two types of vertices in this new graph, those with self
loops, and those without. A vertex without a self loop
represents a statement that may be executed
simultaneously for all values of the index set. A vertex
with a self loop represents a set of statements that form
a recurrence relation.
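
The grouping into PI blocks can be sketched with a transitive closure followed by a search for mutually reachable statements. The Python below uses Warshall's algorithm as one standard closure method; whether the compiler uses this particular method is not stated here, and the function names and the edge list (taken from the loop example given earlier for the first approximation) are assumptions of this illustration.

```python
def transitive_closure(nodes, edges):
    reach = {a: set() for a in nodes}
    for a, b in edges:
        reach[a].add(b)
    for k in nodes:                      # Warshall's algorithm
        for a in nodes:
            if k in reach[a]:
                reach[a] |= reach[k]
    return reach

def pi_blocks(nodes, edges):
    """Return (block, has_self_loop) pairs; a block with a self loop
    is a recurrence."""
    reach = transitive_closure(nodes, edges)
    blocks, seen = [], set()
    for a in nodes:
        if a in seen:
            continue
        block = {a} | {b for b in nodes if b in reach[a] and a in reach[b]}
        seen |= block
        blocks.append((frozenset(block), a in reach[a]))
    return blocks

nodes = [1, 2, 3, 4, 5, 6, 7, 8, 9, 11]
edges = [(1, 2), (2, 5), (2, 11), (3, 4), (4, 5), (4, 6), (4, 7), (4, 8),
         (4, 9), (5, 6), (6, 5), (7, 7), (8, 9), (9, 8), (8, 11), (9, 11)]
blocks = pi_blocks(nodes, edges)
```

With this edge list the blocks carrying self loops are {5, 6}, {7} and {8, 9}; every other statement forms a block of its own without a self loop.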


CONDITIONAL  STATEMENTS 


The methods presented to this point work very well
as long as no conditional transfer of control takes
place. When a conditional statement is encountered,
there are three possibilities:

1) Do nothing and take your lumps by assuming no
overlap between successively executed portions
of the program.

2) Attempt to compute as few conditional statements
as possible by combining conditional statements.

3) Attempt to remove conditional statements from
the scope of index sets.

It  is  advantageous  to  execute  as  few  conditional 
statements  as  possible  on  almost  any  machine  with  some 
type  of  overlap.   This  advantage  is  magnified  by 
pipelining,  and  by  true  parallel  processing.   The  basic 
problem  is  that  when  a  conditional  statement  is 
encountered  in  the  execution  stream,  the  next  section  of 
code  to  be  executed  is  not  known.   This  implies  that  all 
statements  that  are  executed  on  some,  but  not  all,  of  the 
branch  paths  out  of  a  conditional  statement  must  wait 
(have  a  <  relation)  for  the  completion  of  the  conditional 
statement  prior  to  becoming  eligible  for  execution.   In 
some  cases,  this  delay  is  so  large  as  to  allow  the 
computation  of  all  paths  of  a  conditional  statement 
during  the  delay.   With  the  use  of  the  IF  tree 
transformations  [D1],  all  or  a  portion  of  the 
calculations  involved  in  the  various  paths  from  the 
conditional  statement  may  be  overlapped  with  the 
computation  of  the  conditional.   Thus,  when  the  correct 
branch  is  selected,  only  store  or  move  operations  need  to 
be  done. 



Inside the scope of an index set, the problem is
magnified by the size of the index set, and by the
partial orderings that exist across iteration boundaries
that may (conditionally) be introduced. While it would
be highly desirable to remove all conditional statements
from the scope of an index set by some automatic means,
presently available techniques only allow for the removal
of a large percentage [T1]. The types of conditional
statements that can be easily removed are separable
into three classes. The first class is the conditional
statement that does not have any < relations with any
statements within the scope of the index set, and is not
dependent upon the index set. This means that the branch
of the conditional statement to be executed will be
determined prior to entering the scope of the index set.
The program can then be rewritten by replicating the
common code between the various cases, inserting the
special code for each possibility, and placing a slightly
revised conditional statement outside the scope of all of
these index sets.

DO 40 I = 1, 10                  before
A(I) = B(I)
IF ( C .GT. 0 ) GO TO 20
D(I) = 17
GO TO 40
20  D(I) = 11
40  CONTINUE

IF ( C .GT. 0 ) GO TO 25         after
DO 20 I = 1, 10
A(I) = B(I)
D(I) = 17
20  CONTINUE
GO TO 45
25  DO 40 I = 1, 10
A(I) = B(I)
D(I) = 11
40  CONTINUE
45  CONTINUE
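
The equivalence of the two versions can be checked with a small sketch. It is in Python, with whole-list operations standing for statements executed over the entire index set; the function names are illustrative assumptions.

```python
def before(b, c):
    # Conditional tested on every iteration.
    n = len(b)
    a, d = [0] * n, [0] * n
    for i in range(n):
        a[i] = b[i]
        if c > 0:
            d[i] = 11
        else:
            d[i] = 17
    return a, d

def after(b, c):
    # Conditional tested once; each branch holds its own index set.
    n = len(b)
    if c > 0:
        a = list(b)          # A(I) = B(I) for the whole index set
        d = [11] * n         # D(I) = 11 for the whole index set
    else:
        a = list(b)
        d = [17] * n
    return a, d

data = [5, 6, 7]
same_when_positive = before(data, 1) == after(data, 1)
same_when_not_positive = before(data, 0) == after(data, 0)
```

After the rewrite each loop body contains only statements that can be executed simultaneously for all values of the index set.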


Secondly,  if  the  conditional  statement  has  a  < 
relation  with  the  index  set  creator,  then  mode  patterns 
must  be  generated,  so  that  each  of  the  new  index  sets  may 
be  executed  for  the  correct  values.   Exactly  one  of  the 
index  sets  (when  modified  by  the  appropriate  mode 
pattern)  will  contain  a  given  value  from  the  original 
index  set.   When  the  conditional  statement  has  a  < 
relation  with  statements  common  to  all  possible  execution 
paths  through  the  conditional,  and/or  has  a  <  relation 
with  statements  in  at  most  one  of  the  possible  execution 
paths,  then  this  conditional  may  be  removed  by  the  use  of 
mode  patterns  also. 
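
One way to picture a mode pattern is as a vector of enable bits, one per element of the index set, so that each branch of the conditional is applied only where its bit is set. The Python sketch below follows that reading; the interpretation, the function name, and the sample condition B(I) .GT. 0 are assumptions made for illustration rather than details taken from [T1].

```python
def apply_under_mode(dest, value, mode):
    """Store `value` into each element of dest whose mode bit is set."""
    return [value if bit else old for old, bit in zip(dest, mode)]

b = [3, -1, 0, 5]
mode = [x > 0 for x in b]              # pattern for the "true" branch
inverse = [not bit for bit in mode]    # pattern for the "false" branch

d = [None] * len(b)
d = apply_under_mode(d, 11, mode)      # D(I) = 11 where B(I) .GT. 0
d = apply_under_mode(d, 17, inverse)   # D(I) = 17 elsewhere
```

Every index value is enabled in exactly one of the two patterns, which matches the requirement that exactly one of the new index sets contains a given value of the original index set.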

Thirdly, for all remaining conditional statements,
there is a test for the complexity of the
interdependencies represented by the < relations between
the various execution paths and the conditional statement
[T1]. This test was deemed too complex to implement in
the initial version of the compiler. For this type of
conditional, the PI block containing the conditional
statement is marked for serial execution.


CODE  GENERATION 


At this point, the original program has been
transformed into a set of partially ordered statement
groupings. For each of the statement groupings (PI
blocks), code is compiled independently of the partial
orderings. If the PI block has no self loop in the
partial ordering, then it represents some type of scalar
operation performed on a possibly large number of
elements. When a PI block has a self loop, it represents
a recurrence.

Because a three address machine was chosen as the
target machine, and because of the transformation that
changes the original FORTRAN program into statements
having only one operation, at most two operands, and one
result, code generation for PI blocks without self loops
is trivial. Subscript information, if present, is extracted
for the operands and the result. Similar information is
collected for the upper bounds of all of the index sets
(recall that the lower bound is always zero, and the
increment is always one). The operation being performed
is translated to the appropriate machine code. (Built-in
functions of the serial language are assumed to be
implemented as a single instruction in the target
machine.) The entire collection is then packaged with the
operand and result addresses, to form one, rather large,
machine instruction. Conventional machine code could be
generated at this point also.

Recurrences are a little more complex. A test is
performed first to see if the recurrence is linear. A
linear recurrence is a recurrence which can be described
by

               n-1
Xn  =  Cn  +  SUM  ( Bi * Xi )
               i=0

At present, if a recurrence fails this test, then
explicit code is generated to execute the recurrence in a
serial manner. Certain improvements in this approach are
possible within current knowledge (see future
developments). When the recurrence is linear, the set of
assignment statements that form the recurrence is
converted to a lower triangular banded matrix for
calculation by the fastest known method [C1, S1]. This
method is intuitive for simple cases.

DO 10 I = 1, 10
Z(I) = 3 * Z(I-1) + C(I)
10   CONTINUE

By writing out all 10 equations, and ignoring little
details like array bounds, a set of linear equations
suitable for the backwards substitution portion of
Gaussian elimination is obtained:

Z(1) = C(1) + 3*Z(0)
Z(2) = C(2) + 0       + 3*Z(1)
Z(3) = C(3) + 0       + 0       + 3*Z(2)
etc.

This can be written in the matrix form z = A * z + c.
The target machine is assumed to have an instruction to
compute recurrences that can be expressed in this form.
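
The matrix form can be sketched for the loop Z(I) = 3*Z(I-1) + C(I). The Python below is an illustration only: it folds the known value Z(0) into the first element of c (an assumption about how the boundary is handled), builds the strictly lower triangular A explicitly, and solves z = A*z + c one row at a time. It shows the form of the system, not the fast method of [C1, S1].

```python
def serial_recurrence(z0, c):
    # Direct evaluation of Z(I) = 3*Z(I-1) + C(I).
    z = [z0]
    for ci in c:
        z.append(3 * z[-1] + ci)
    return z[1:]

def matrix_form(z0, c):
    n = len(c)
    # A has the coefficient 3 on the first subdiagonal, zero elsewhere.
    a = [[3 if j == i - 1 else 0 for j in range(n)] for i in range(n)]
    rhs = list(c)
    rhs[0] += 3 * z0                     # known boundary value Z(0)
    z = [0] * n
    for i in range(n):
        # Row i uses only the already computed z[0..i-1].
        z[i] = rhs[i] + sum(a[i][j] * z[j] for j in range(i))
    return z

c_values = [1, 2, 3, 4]
from_loop = serial_recurrence(2, c_values)
from_matrix = matrix_form(2, c_values)
```

Both functions give the same values, which is the sense in which the recurrence and the system z = A*z + c describe the same computation.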

The appropriate values of A and c must be collected.
If constants are used as coefficients in the recurrence,
then these elements can be calculated at compile time.
When variables are used, the A and c operands must be
seeded at run time with the appropriate values. This
necessitates the generation of scalar operations whose
result and one operand are A or c. At most one such
instruction per statement involved in the recurrence is
needed to fill c, while at most one instruction per right
hand side variable (which is involved in the recurrence)
is needed to initialize A. Once the result of the
recurrence is calculated, the results must be dispersed
to the appropriate left hand side variables. This
requires one instruction per left hand side variable.
(If only one left hand side variable exists, then this
step is not necessary.)


III. THE NEED FOR THESE TECHNIQUES

As truly parallel machines become more and more
practical and available, a large volume of programs that
are currently coded in a serial language will have to be
converted to a more parallel form. While there are
several choices on the conversion question, almost all of
them are prohibitively expensive:

1) Have an analyst construct parallel versions of the
serial algorithms.

2) Run the program in its serial form on the parallel
machine.

3) Automatically convert the serial version to a more
parallel form.

Because of the cost of computational resources and
personnel, the only viable alternative is automatic
conversion. The compilation techniques used in this
compiler are not only result preserving, but produce
efficient code, in most instances, for parallel machines.

At  first  glance,  it  appears  that  the  translation  of 
a  program  in  a  serial  language  to  a  program  in  a  parallel 
form  would  be  useful  only  on  machines  that  have  vector 
operations  like  STAR  or  ILLIAC  IV.   While  the  parallel 
version of the serial program is quite useful for
machines  such  as  these,  a  rather  large  number  of 
applications  exist  for  more  conventional  machines.   One 
way  to  look  for  advantages  of  the  parallel  version  of  a 
program  over  the  serial  version  is  to  consider  the  two 
major differences:

1)  At  any  point  in  a  parallel  program,  at  least  one 
statement  is  available  for  execution.   At  any  point 
in  a  serial  program,  only  one  statement  is  available 
for  execution. 

2)  Index  sets  in  a  parallel  program  contain  only  one  of 
the  smallest  set  of  statements  that  must  be  executed 
together.  Index  sets  in  a  serial  program  contain  at 
least  one  of  these  smallest  units. 

These distinctions seem rather trivial for a serial
machine until the individual components of a particular
serial machine are examined. If the machine has
independent functional units like the CDC 6600, the wider
the choice of instruction orderings available to the
compiler, the more efficient the generated code can
be. If the machine has a small instruction cache, then
the  smallest  possible  loop  has  a  better  chance  of  fitting 
within  the  cache.   If  the  machine  has  a  memory  structure 
designed  to  reference  contiguous  locations  efficiently 
(interleaving, interlacing, multiword fetches, etc.),
then  the  fewer  the  number  of  operand  streams  existing 
within  a  loop,  the  easier  the  storage  assignment  problem 
is  for  utilization  of  these  memory  structures. 
Scheduling  a  program  on  a  multiprocessor  system  such  as 
C.mmp  becomes  trivial,  as  each  PI  block  could  be  assigned 
to  its  own  processor  as  it  becomes  eligible  for 
execution.   Paging  behavior  of  a  program  is  also  improved 
by  the  localization  of  the  instruction  and  data 
references. 

Some  standard  compilation  transformations  are  also 
easily  obtainable  from  the  parallel  version  of  a  program. 
If  a  calculation  precedes  no  other  calculation,  then  it 
is  redundant,  and  may  be  pruned  from  the  execution  graph. 
If  a  computation  is  inside  the  scope  of  an  index  set  and 
does  not  use  the  variable  associated  with  the  index  set, 
and  is  not  a  recurrence,  then  the  index  set  may  be 
deleted  (equivalent  to  moving  a  statement  outside  the 
scope of an index set).
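
The pruning rule above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the compiler's code; the function name `prune_redundant`, the dictionary representation of the execution graph, and the `live` set (calculations whose results are needed outside the graph, such as output statements) are all assumptions made for this example.

```python
def prune_redundant(succ, live):
    """Drop calculations that precede no other calculation.

    `succ` maps each calculation to the set of calculations it precedes.
    `live` (an assumption of this sketch) names calculations whose results
    are needed outside the graph and so must never be pruned.
    """
    graph = {node: set(after) for node, after in succ.items()}
    changed = True
    while changed:                      # pruning may expose new dead nodes
        changed = False
        for node in list(graph):
            if not graph[node] and node not in live:
                del graph[node]
                for after in graph.values():
                    after.discard(node)
                changed = True
    return graph

# "c" precedes nothing and is not live, so it is pruned.
pruned = prune_redundant(
    {"a": {"b", "c"}, "b": {"d"}, "c": set(), "d": set()}, live={"d"})
```

The loop repeats until nothing changes because removing one calculation can leave its predecessors without successors.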

The  disadvantages  of  the  parallel  form  of  a  program 
written in a serial language are derived from the same
areas  as  the  advantages.   By  making  the  statements  within 
the  scope  of  an  index  set  as  few  as  possible,  the  number 
of  index  sets  is  (possibly)  increased.   This  means  that 
there  are  more  incrementing  and  testing  operations  to  be 
done to complete the execution of the entire program than
in  the  serial  form.   By  doing  forward  substitution,  more 
computations  must  be  done  to  achieve  the  same  result.   So 
the  penalty  that  is  paid  for  the  increased  parallelism  is 
a  little  more  work  that  must  be  done.   For  some  machines, 
the advantages easily outweigh the disadvantages, and for
others, the decision is made just as easily in the
other direction. At present, a significant number of
machines appear to be on the borderline. As machines
become more and more parallel, the translation techniques
used in this compiler will become more and more useful.

IV. FUTURE DEVELOPMENTS


There  are  several  areas  in  which  the  algorithms  used 
in  this  compiler  can  be  improved: 

1)  In  the  area  of  recurrences,  the  cutting, 
folding  and  splitting  theorems  [C1]  would 
improve  efficiency  of  the  resulting  code. 

2)  The  approximation  to  the  minimal  valid  partial 
ordering  can  be  improved  by  treating  each  of 
the three ordering conditions separately [K1].

3)  The  efficiency  of  conditional  statements  inside 
the  scope  of  index  sets  may  be  improved  by  a 
more  complete  implementation  of  the  IF  removal 
theorems [T1].

4)  The  conditional  statements  outside  the  scope  of 
all  index  sets  could  be  handled  by  the  method 
of IF trees [D1].

5)  More  index  sets  could  be  found  by  recognizing 
induction  variables. 

6) Special types of recurrences could be
recognized for which an efficient method of
computation is well known [H1].

7) COMMON and EQUIVALENCE statements must be
considered.

8)  Expand  some  subroutines  in  line. 

Work is presently in progress in the first four areas.

APPENDIX A
The  Structure  of  the  Compiler 

Because  the  compiler  was  designed  as  a  test  frame 
for  compilation  techniques,  modifiability  was  of  prime 
importance,  as  was  the  ability  to  select  which 
transformations  should  be  applied  during  a  particular 
compilation.   For  this  reason,  an  extremely  modular 
compiler  was  designed  with  all  transformations  modifying 
the  same  data  structure.   The  algorithms  used  in  each  of 
the modules are outlined in appendix B. The major data
structures  are  described  in  appendix  C.   A  large  set  of 
option  switches  was  implemented  for  controlling 
transformations  and  the  volume  of  output. 

Programs  are  first  processed  by  a  lexical  analyzer 
(routine  SCANNER)  into  a  tokenized  version  of  the 
program.   The  output  of  the  lexical  analyzer  is  a  symbol 
table,  a  brief  description  of  each  statement  of  the 
program,  and  a  detailed  token  by  token  representation  of 
those  statements  in  the  program  that  have  a  variable 
number of items associated with them. These three
structures  (SYMTAB,  PROGRAM,  and  DETAIL)  are  passed  to 
the  rest  of  the  compilation  process  by  a  disk  file.   A 
program  need  only  be  lexically  analyzed  once. 

When the compiler proper (routine MASTER) receives
control, the current compiler options (which
transformations, outputs desired, etc.) are initialized
(routine OPTIONS). Then the disk file of the tokenized
program is read (routine READIN). A FORTRAN version of
the tokenized program is printed (routine UNSCAN). The
compiler then gives the user the opportunity to change
the value of any scalar integer variables by inserting
DATA statements into the program (routine CCARDS).

The  index  sets  of  the  program  are  located  and 
normalized  to  run  from  zero  to  some  upper  bound  with  an 
increment of one (routine DONIFS). All known expressions
of index sets and constants are substituted, following
standard scope rules (routine DONIFS). A FORTRAN version
of the transformed program is printed (routine UNSCAN).
The compiler user has the opportunity to redefine the
upper bounds of index sets by inserting DATA statements
(routine CCARDS). (Note that the magnitude of the upper
bound of an index set may make a considerable difference in
the partial ordering.) The subscript expressions and
index set upper bounds are converted (if possible) to a
parenthesis free representation by the use of
distribution and other basic algebraic identities
(routine INDEX). A FORTRAN version of the transformed
program is printed (routine UNSCAN).

For every contiguous block of statements outside of
the scope of all index sets, and containing no branch
target, statement forward substitution is done to
increase the width of the expression evaluation trees and
to eliminate partial ordering restrictions within the
block. A partial ordering is developed (routines IOLISTS
and DEPEND) for the block, and statement level
substitution is done between dependent statements (routine
FORWARD). A FORTRAN version of the transformed program
is printed (routine UNSCAN).

When  selected  by  the  user,  conditional  statement 
removal  is  done  by  calling  routine  IFREMOV.   IFREMOV 
removes  the  conditional  statements  from  the  scope  of 
index sets, if possible. IFREMOV uses routines SEGMENT,
IOLISTS,  and  DEPEND.   The  routine  IFTREE  is  called,  at 
user  option,  to  combine  all  conditional  statements 
outside  the  scope  of  index  sets.   Routine  IFCLUB  is 
called  to  remove  all  remaining  conditional  statements 
from  the  program  by  assuming  the  worst  possible  type  of 
overlap: none. IFCLUB does this by breaking the
program  into  segments  of  straight  line  code.   Routine 
SEGMENT  is  used  to  locate  all  of  these  segments. 

The  program  is  then  converted  to  three  address 
FORTRAN code (i.e., one result and at most two operands
per assignment statement) (routine INDEX). A FORTRAN
version of the transformed program is printed (routine
UNSCAN). Except for some minor differences in DO loop
bounds,  the  tokenized  version  of  the  program  could  have 
come  directly  from  the  lexical  analyzer.   The  original 
FORTRAN  program  could  have  been  coded  in  this  form. 
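
As a rough illustration of the three address form, the sketch below flattens one assignment into statements with one result and at most two operands each. It is written in Python rather than in the compiler's implementation language, it borrows Python's `ast` module in place of the compiler's own parser, and the function name `to_three_address` and the temporary names `T1`, `T2`, ... are hypothetical.

```python
import ast

_OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_three_address(target, expr):
    """Flatten `target = expr` into statements with at most one operation."""
    code = []       # generated statements, in execution order
    counter = 0     # number of temporaries created so far

    def emit(node, dest=None):
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            left = emit(node.left)
            right = emit(node.right)
            if dest is None:            # inner operation: needs a temporary
                counter += 1
                dest = f"T{counter}"
            code.append(f"{dest} = {left} {_OPS[type(node.op)]} {right}")
            return dest
        raise ValueError("unsupported expression")

    result = emit(ast.parse(expr, mode="eval").body, dest=target)
    if result != target:                # plain copy such as A = B
        code.append(f"{target} = {result}")
    return code

statements = to_three_address("A", "B + C * (D - E)")
# statements == ["T1 = D - E", "T2 = C * T1", "A = B + T2"]
```

Only scalar operands and the four arithmetic operators are handled here; subscripted variables would need the subscript lists to be carried along unchanged.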

Now, the traditional compilation process begins
(routine COMPILE). A list of variables used by each
statement is built along with a list of variables
modified by each statement (routine IOLISTS). Then a
minimal valid partial ordering is approximated (routine
DEPEND). All of the transitive orderings are discovered,
and maximal totally connected subgraphs (PI blocks) of
the program graph are located. A partial ordering
between the PI blocks is deduced from the ordering in the
program graph (routine PIPART). Code is then generated
for each PI block (routine GENER). The tokenized version
of the FORTRAN program (after all transformations) is
then output along with the code that was generated
(routine WRITEIT).
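
The partitioning step described above can be sketched as follows. This is an illustrative Python sketch, not the PIPART routine: the function name `pi_blocks` and the edge-list input are assumptions, and Warshall's algorithm is used for the transitive orderings only because the text does not say how the compiler computes them.

```python
def pi_blocks(n, edges):
    """Partition statements 0..n-1 into PI blocks and order the blocks.

    `edges` holds pairs (i, j) meaning statement i must precede statement j.
    """
    # Transitive closure: reach[i][j] is True if i must precede j.
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True

    # Statements ordered in both directions (a cycle) share one PI block.
    block_of = [-1] * n
    blocks = []
    for i in range(n):
        if block_of[i] != -1:
            continue
        members = [i] + [j for j in range(i + 1, n)
                         if reach[i][j] and reach[j][i]]
        for s in members:
            block_of[s] = len(blocks)
        blocks.append(members)

    # Partial ordering between blocks, inherited from the statements.
    order = {(block_of[i], block_of[j])
             for i in range(n) for j in range(n)
             if reach[i][j] and block_of[i] != block_of[j]}
    return blocks, order

# Statements 1 and 2 form a cycle, so they land in the same PI block.
blocks, order = pi_blocks(4, [(0, 1), (1, 2), (2, 1), (2, 3)])
```

The returned ordering is itself transitively closed, so a code generator that wants only immediate successors would have to reduce it further.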

APPENDIX B


Individual  Module  Descriptions 

CCARDS


FUNCTION:

To define scalar integer variables (as if coded in DATA
statements).

METHOD:

A CHARACTER(6) VARYING value is read from the file
CONTROL. If the value is ***EOF, the CCARDS routine
returns control to the calling procedure. Otherwise,
this value is looked up in the symbol table to see if it
is the name of a scalar integer variable. If so, a FIXED
BINARY(15) variable is read from the file CONTROL, and
the appropriate symbol table entry is updated.
Appropriate  error  messages  are  output  when  any  unexpected 
condition  is  met.   The  method  is  then  repeated. 


PARAMETERS: 


PRFLAG - the debug print flag - BIT(1)

CONDITIONS:

ENDFILE(CONTROL) - equivalent to reading ***EOF
UNDEFINEDFILE(CONTROL) - equivalent to reading ***EOF
CONVERSION - the value causing the error is ignored

MESSAGES:

CCARDS CALLED

CCARDS UNABLE TO FIND VARIABLE "XXXXXX"

CCARDS VARIABLE "XXXXXX" IS NOT A SCALAR INTEGER
VARIABLE, AND CANNOT BE INITIALIZED

CCARDS VARIABLE "XXXXXX" HAS VALUE ###

CCARDS CONVERSION ERROR ON INPUT. CHARACTER "X" IS IN
ERROR. SKIPPING TO NEXT RECOGNIZABLE INPUT

CCARDS RETURNS

COMPILE


FUNCTION:

The direction of the code generation process for a
program segment.

METHOD:

The  input/output  list  is  built  for  the  program  segment 
using  the  routine  IOLISTS.   A  particular  DO  loop 
parallelization  is  selected  from  the  file  DOPARAL  if 
FLAG.DO_LIST  is  true.   Otherwise,  the  all  parallel  case 
is  selected.   The  data  dependency  graph  is  built  using 
the  routine  DEPEND,  and  this  graph  is  partitioned  using 
the routine PIPART. The actual code is then generated by
calling the routine GENER. The compiled code is then
written to the various output files by calling the
routine WRITEIT. If FLAG.PERMUTE and FLAG.DO_LIST are
both false, then COMPILE returns. If FLAG.PERMUTE is
true, and FLAG.DO_LIST false, the next parallelization in
the permutation sequence is selected. If FLAG.DO_LIST is
true,  a  DO  loop  parallelization  is  selected  from  the  file 
DOPARAL.   The  method  is  repeated  until  all  permutations 
are  exhausted,  or  until  a  negative  parallelization  is 
encountered  on  file  DOPARAL. 

PARAMETERS:

None 

CONDITIONS: 


ENDFILE(DOPARAL) - causes FLAG.DO_LIST to be set false

UNDEFINEDFILE(DOPARAL) - causes FLAG.DO_LIST to be set
false


MESSAGES:

COMPILE CALLED PROGRAM_PTR = ###

COMPILE LIST SPECIFICATION IMPOSSIBLE. NO FILE DOPARAL.
(ONCODE=###) DEFAULT PARALLELIZATION ASSUMMED.

COMPILE END OF FILE ON DOPARAL. MISSING NEGATIVE
PARALLELIZATION. DEFAULT PARALLELIZATION
ASSUMMED.

COMPILE LIST PARALLELIZATION # # # # #

COMPILE PERMUTE PARALLELIZATION # # # # #

COMPILE DEFAULT PARALLELIZATION # # # # #......

COMPILE  RETURNS 

DEPEND


FUNCTION: 

To  identify  all  of  the  "must  precede"  (  <  )  relations 
specified  by  the  statement  orderings  and  subscript 
information. 

METHOD: 

First,  the  number  of  statements  and  the  number  of  index 
sets  present  in  the  program  are  counted.   The  scope  of 
each  of  the  index  sets  is  saved  for  later  use.   Every 
input  variable  is  then  compared  with  every  output 
variable,  and  every  output  variable  is  compared  with 
every output variable to check for data dependence. If
the statements containing the variables being compared
already are related by < relations in both directions (a
cycle exists), then the test is skipped for that
combination of variables.

When  two  variables  are  being  compared,  the  variable  names 
are  checked  for  agreement  first.   If  this  test  is  passed, 
subscripts  (if  present)  are  checked  next.   Each  subscript 
position (i.e., for each dimension) is checked
independently. First, the subscript is checked for
linearity (see IOLISTS). If the subscript is not linear,
then the worst case must be assumed (a cycle if inside
the scope of an index set, otherwise, the linear ordering
portion of the cycle). If the subscripts are linear,
then  data  is  collected  to  allow  the  classification  of  the 
subscript  expression  into  types  for  ease  in  testing  for 
dependence.   Two  simple  tests  are  done  on  all  types  of 
subscripts  to  quickly  determine  if  a  dependence  is 
possible: 

1) The GCD test from integer number theory [N1].

2)  The  real  root  in  the  interval  test. 

If only constants are involved in these subscript
expressions,  then  special  case  0  is  invoked.   This  case 
is  exact,  and  very  simple,  for  all  it  does  is  check  to 
see  if  the  constants  in  both  subscripts  are  the  same.   If 
the  subscripts  contain  only  one  index  variable  between 
them,  then  special  case  1  is  invoked.   This  case  simply 
enumerates all possible cases for the coefficients
involved  and  the  ordering  of  the  variables.   While  not  a 
pretty  test,  this  test  is  exact,  and  relatively  simple  to 
understand.   When  the  subscripts  contain  two  index 
variables  between  them  (not  necessarily  distinct),  then 
special  case  2  is  invoked.   This  case  only  works,  at 
present,  if  the  coefficients  on  both  of  the  index 
variables are +1 or -1. Enumeration of cases is again
the  method  of  solution.   If  a  subscript  does  not  fit  the 
qualifications for any special case, then the worst case
condition is assumed.
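
The first of the two quick tests, the GCD test, can be sketched as below for a single subscript position. This is a minimal Python illustration under stated assumptions, not the DEPEND routine: the function name `gcd_test` is hypothetical, the two subscripts are assumed to have the linear forms a*i + b and c*j + d, and loop bounds are ignored (the real root in the interval test would account for them).

```python
from math import gcd

def gcd_test(a, b, c, d):
    """Return False if a*i + b == c*j + d has no integer solution (i, j).

    A False result proves the two references can never touch the same
    element; True only means this test cannot rule a dependence out.
    """
    g = gcd(a, c)
    if g == 0:              # both coefficients are zero: constants only
        return b == d       # corresponds to special case 0 in the text
    return (d - b) % g == 0

# X(2*I) and X(2*J + 1): even and odd elements never coincide.
independent = not gcd_test(2, 0, 2, 1)
# X(2*I) and X(4*J + 2): a dependence cannot be excluded.
possible = gcd_test(2, 0, 4, 2)
```

The test is conservative in one direction only, which is why the text pairs it with further special cases before a dependence is finally assumed.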

PARAMETERS:

IOHEAD - head of input output list - POINTER

IOTAIL - tail of input output list - POINTER

DH - dummy - POINTER

DT - dummy - POINTER

N - number of statements - FIXED BINARY(15)

ACTIVE - number of index sets - FIXED BINARY(15)

SIMUL - simultaneous index set - (*) BIT(1)

PRFLAG - print flag - BIT(1)


CONDITIONS:
none

MESSAGES:

A  data  dependency  graph  is  printed. 

Statistics  for  the  various  data  dependency  tests 
are  printed. 

Optional  information  concerning  every  variable 
comparison  is  printed. 

DETCMPT


FUNCTION: 

To  reconstruct  DETAIL,  deleting  null  elements,  and 
following  indirect  pointers.   This  procedure  places 
DETAIL  back  into  the  form  that  the  SCANNER  produces. 

METHOD:

A new copy of DETAIL is created by stepping through each
statement record of every program segment and copying the
DETAIL entries described to a temporary DETAIL array
(named NEW_DETAIL). If the forward substitution table
(named FORTAB) is allocated, the DETAIL entries described
in FORTAB are also copied. The old DETAIL is then
released, the size of DETAIL is adjusted, and it is
reallocated. The NEW_DETAIL array is then copied into
DETAIL element by element. The NEW_DETAIL array is then
released. The control pointers into DETAIL from PROGRAM
and FORTAB are updated during the copying process.
DETLAST  and  DETSIZE  are  also  updated. 

PARAMETERS: 


EXTRA - the amount of free space to be left in copied
DETAIL - FIXED BINARY(15)

CONDITIONS:

ND_OVERFLOW - causes NEW_DETAIL to grow
BAD_PROGRAM_TYPE - raises the ERROR condition

MESSAGES:

DETCMPT CALLED. EXTRA = ###

DETCMPT NEW DETAIL OVERFLOW

DETCMPT BAD ENTRY IN PROGRAM ARRAY. STATEMENT TYPE = ###

DETCMPT COPY_1, ND_CURRENT = ###, ENTRY = "X"

DETCMPT COPY_IT, FROM = ###, SIZE = ###

DETCMPT COPIER, PROGRAM_PTR = ###, DETAIL_PTR = ###

DETCMPT COPY_BAS, PROGRAM_PTR = ###, DETAIL_PTR = ###

DETCMPT RETURNS

DONIFS


FUNCTION:

To normalize all DO loop index sets to begin at zero, and
to have an increment of one. To discover as much as
possible about the integer expressions used in
subscripts.

METHOD:

There  are  two  processes  being  done  simultaneously  in  this 
module.   For  ease  in  discussion,  the  processes  will  be 
considered separately.

DO  loop  normalization  is  a  very  simple  process.   Whenever 
a  DO  loop  is  encountered,  it  is  normalized  after  index 
forward  substitution  (which  will  be  discussed  later)  is 
done on the statement. A DO loop is of the following
form prior to normalization:

DO STMT# INDEX = LOWER, UPPER, INCR

After normalization:

DO STMT# NEW_INDEX = 0, NEW_UPPER, 1

The  new  upper  bound  is  a  very  simple  function  of  the  old 
lower  bound,  increment,  and  upper  bound: 

NEW_UPPER = (UPPER - LOWER) / INCR

The  old  index  variable  can  easily  be  computed  from  the 
new  index  variable: 

INDEX = NEW_INDEX * INCR + LOWER
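
A short worked example may make the two formulas concrete. The sketch is in Python for illustration only; the function names `normalize_do` and `original_index` are hypothetical, and it assumes a positive increment with LOWER not exceeding UPPER, so that integer division behaves like FORTRAN integer division.

```python
def normalize_do(lower, upper, incr):
    """Return NEW_UPPER for a loop DO INDEX = LOWER, UPPER, INCR."""
    # Integer division; with a positive increment and lower <= upper,
    # truncation and floor division give the same result.
    return (upper - lower) // incr

def original_index(new_index, lower, incr):
    """Recover the old index value from the normalized index."""
    return new_index * incr + lower

# DO 10 I = 3, 20, 4 visits I = 3, 7, 11, 15, 19.
new_upper = normalize_do(3, 20, 4)
visited = [original_index(k, 3, 4) for k in range(new_upper + 1)]
```

The normalized loop runs from 0 to NEW_UPPER inclusive, so it performs the same number of iterations as the original loop.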

The  entries  in  the  statement  record  for  the  DO  loop  are 
updated  to  reflect  the  changes  made  by  DO  loop 
normalization. A zero is filled into the lower bound
(rather than a pointer to the symbol table), and a one is
filled into the increment. The upper bound is set to an
indirect pointer to the DETAIL array where the expression
for NEW_UPPER shown above is stored (indirect pointers
are always negative). The name of the variable
associated with the new upper bound is "U$##", where the
## comes from the number of DO loops previously
encountered  in  this  call  to  DONIFS. 

Similarly, the new index variable is named "X$##".
The  symbol  table  pointer  for  the  new  index  variable 
replaces  the  old,  and  an  entry  is  placed  in  the  index 
forward  substitution  table  (FORTAB)  for  the  old  index 
variable (see the expression given above).

The substitution bounds on this entry are equal to
the  bounds  of  the  original  DO  loop. 

The process of index forward substitution is also very
simple. First, every statement is examined for a
variable that is already in the forward substitution
table (within the substitution bounds, of course). If
such a variable is found, the occurrence of the variable
is replaced by an indirect pointer to the statement
defining the substitution for that variable. (Note that
these indirect pointers are followed and removed by the
routine DETCMPT.) A variable can be added to the forward
substitution list if it is a scalar integer variable and
if every operand on the right hand side of the assignment
statement is a known scalar integer variable (DATA
statement), a DO loop index, or a constant.

PARAMETERS: 


PROGRAM_PTR - pointer to a program segment - FIXED
BINARY(15)


CONDITIONS:

FT_OVERFLOW - causes the forward substitution table to grow
SYMBOL_TABLE_OVERFLOW - causes the symbol table to grow

MESSAGES:

DONIFS CALLED. PROGRAM_PTR = ###

DONIFS DO LOOP AT PROGRAM_PTR = ### IS NORMALIZED.

NEW STATEMENT RECORD: ## ## ## ## ## ## ## ## ##

DONIFS ADD SYMBOL AT ##, NAME IS "XXXXXX", TYPE = ##

DONIFS FORTAB ENTRY ADDED AT ##. ENTRY = (## ## - ##)

DONIFS FORTAB ENTRY REPLACED AT ##. ENTRY = (## ## - ##)

DONIFS SUBSTITUTE AT DETAIL_PTR = ## OF ## FOR ##

DONIFS SYMBOL TABLE OVERFLOW

DONIFS FORWARD SUBSTITUTION TABLE OVERFLOW

DONIFS RETURNS

ERROR


FUNCTION: 

To  generate  error  and  warning  messages  from  stock 
phrases. 

METHOD:

There  are  two  entry  points  to  ERROR:   ERROR  and  WARN. 
Both  entry  points  use  the  same  message  generation 
process.   The  message  number  is  used  as  an  index  into  a 
table  of  messages.   This  message,  along  with  the 
additional  character  string  parameter  is  printed  out  in  a 
box  made  of  *  characters.   If  the  WARN  entry  point  was 
used,  control  is  returned  to  the  caller.   If  the  ERROR 
entry  point  was  used,  the  ERROR  condition  is  raised. 

PARAMETERS: 

# - the error message number - FIXED BINARY(15)

C - the additional character message - CHARACTER(*)
VARYING

CONDITIONS: 


none 

MESSAGES:

An  error  or  warning  message  enclosed  in  a  box  of  * 

Possible  messages: 

UNDEFINED FILE....NO OUTPUT WILL BE WRITTEN TO
FILE  "XXXXXX" 

UNDEFINED  FILE.... NO  DATA  READ  FROM  FILE  "XXXXXX" 

UNEXPECTED  END  OF  FILE  ON  FILE  "XXXXXX" 

FORWARD


FUNCTION: 

To  take  a  portion  of  a  program  segment,  and  to  rewrite 
each  statement  in  this  portion  much  like  index  forward 
substitution,  but  without  worrying  about  known  values  or 
variable  types.   Subscripted  arrays  are  handled  in 
addition  to  scalars. 

METHOD:

This  routine  uses  the  data  dependency  graph  to  make  the 
search  space  smaller  for  the  substitution  process.   If 
one  statement  is  dependent  on  another,  then  FORWARD  will 
look  at  the  output  variables  of  the  statement  that  must 
be  executed  first,  and  see  if  an  input  variable  occurs 
with  the  same  name  in  the  other  statement.   If  so,  the 
subscript  lists  are  checked  for  linearity  (see  the 
definition  under  the  module  IOLISTS).   If  the  subscripts 
are  linear,  then  the  compiler  could  perform  the  necessary 
subscript  manipulation  to  substitute.   If  both  subscripts 
are  not  linear,  an  elementary  pattern  match  is  done  to 
see  if  the  subscripts  contain  the  same  variables,  with 
the  same  operators  relating  them.   If  this  is  true,  then 
the  substitution  is  also  performed.   Otherwise,  no 
substitution  is  performed.   Substitution  is  done  by 
placing an indirect pointer to the substituting
expression  in  place  of  the  occurrence  of  the  input 
variable.   The  subscript  list  (if  any)  in  DETAIL  that  was 
associated  with  the  variable  that  was  replaced  is  nulled 
out. 
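
The effect of statement forward substitution on a straight line block can be sketched as follows. This Python sketch is illustrative only and simplifies the method above considerably: it handles scalar variables and integer constants but not subscripted arrays, it rewrites expression text instead of placing indirect pointers in DETAIL, and the function name `forward_substitute` is hypothetical.

```python
import re

NAME = re.compile(r"\b[A-Za-z][A-Za-z0-9]*")

def forward_substitute(statements):
    """Substitute earlier right hand sides into later statements.

    `statements` is a list of (target, expression) pairs for a straight
    line block of scalar assignments.
    """
    known = {}      # scalar name -> its fully substituted definition
    result = []
    for target, expr in statements:
        # Parenthesize each substitution so precedence is preserved.
        new_expr = NAME.sub(
            lambda m: f"({known[m.group(0)]})"
            if m.group(0) in known else m.group(0),
            expr)
        # Definitions that read the old value of `target` are now stale.
        known = {v: e for v, e in known.items()
                 if target not in NAME.findall(e)}
        known.pop(target, None)
        if target not in NAME.findall(new_expr):
            known[target] = new_expr
        result.append((target, new_expr))
    return result

block = forward_substitute(
    [("A", "B + C"), ("D", "A * 2"), ("E", "D - A")])
```

After substitution every statement in `block` depends only on B and C, so the three statements no longer have to execute in order, at the cost of recomputing B + C.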

PARAMETERS:

IOHEAD - list head for the input output lists - POINTER
IOTAIL - list tail for the input output lists - POINTER
DH, DT - dummy parameters - POINTER
N - the number of statements - FIXED BINARY(15)
PRFLAG - the debug print flag - BIT(1)


CONDITIONS: 


none 


MESSAGES: 

FORWARD  CALLED  (###  STATEMENTS) 
FORWARD  SUBSTITUTING  ##  AT  DETAIL  (##) 

FORWARD  RETURNS 

GENER


FUNCTION:

To transform the PI PARTITIONed three address FORTRAN
program into vector machine language. To collect certain
bounds associated with the complexity of the computation
of the program.

METHOD:

The  input  output  lists  are  used  to  build  the  output  code. 
First  an  index  is  built  that  allows  a  statement  number  to 
reference  an  input  output  list  directly,  rather  than 
having  to  search  the  list  for  it.   The  BEGIN  SEGMENT  node 
is  generated.   Then  each  PI  BLOCK  generates  one  code 
element.   Depending  upon  the  cyclicity  of  the  PI  BLOCK, 
and  upon  the  statement  type,  an  appropriate  subroutine  is 
called  to  generate  the  code  for  that  PI  BLOCK.   The  code 
element  for  the  END  SEGMENT  is  then  built.   At  present, 
there  is  only  one  type  of  code  element,  that  for  the 
three  address  code  element.   Obviously,  this  is  not 
needed  for  such  operations  as  BEGIN_SEGMENT,  END_SEGMENT, 
NOP, DO, or DO_END. Two more types of code elements are
currently  planned,  both  of  which  are  subsets  of  the 
current  code  element.   Now  that  all  of  the  code  elements 
are  built,  the  successor  list  is  built  using  the  partial 
ordering information for the PI BLOCKS in PORD. If a PI
BLOCK has no predecessors, then this PI BLOCK is made a
successor to the BEGIN_SEGMENT node. If a PI BLOCK has
no successors, and is not inside a serial DO, then this
PI BLOCK is made a predecessor to the END_SEGMENT node.

When  building  the  code  element  for  a  simple  assignment 
statement  or  input/output  statement,  the  output  variable 
is  the  first  variable  on  the  output  list  of  the  input 
output list for this PI BLOCK. Similarly, the first and
second  operands  come  from  the  first  and  second  operands 
on  the  input  list  of  the  input  output  list  for  this  PI 
BLOCK.   The  operation  to  be  done  is  discovered  by  looking 
at  the  DETAIL  pointer  contained  in  the  last  input  output 
list atom examined. This pointer (when adjusted
correctly)  points  to  the  operator  token  in  DETAIL  that  is 
correct  for  this  PI  BLOCK. 

When  building  the  code  element  for  a  recurrence,  the 
RECURRENCE  operation  itself  is  built  first.   The  size  of 
the  A  matrix  is  calculated  by  multiplying  the  product  of 
the  parallel  index  sets  by  the  number  of  nontemporaries 
as  the  first  element  in  the  output  list  of  all  of  the 
input  output  lists  for  this  PI  BLOCK.   The  recurrence 
head  (a  NOP)  is  then  allocated,  and  the  setup  nodes 
(ADDITIONS, MULTIPLIES, etc.) for A and C are allocated
and linked between the recurrence head and the RECURRENCE
operation itself. The recurrence tail is then allocated
(a NOP).

Then  the  output  dispersal  nodes  are  allocated  for 
every  nontemporary  as  the  first  element  of  an  output  list 
of  all  of  the  input  output  lists  for  this  PI  BLOCK  and 
linked  between  the  RECURRENCE  operation  and  the 
recurrence  tail. 
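
The successor list construction described earlier in this method can be sketched as below. The Python code is an illustration under assumptions and not the GENER routine: the function name `link_segment`, the set-of-pairs form of the partial ordering, and the `in_serial_do` argument (the PI blocks lying inside a serial DO) are all hypothetical.

```python
def link_segment(num_blocks, order, in_serial_do=frozenset()):
    """Build successor lists for PI blocks plus the two segment nodes.

    `order` holds pairs (p, s): PI block p must precede PI block s.
    """
    BEGIN, END = "BEGIN_SEGMENT", "END_SEGMENT"
    succ = {BEGIN: [], END: []}
    succ.update({b: [] for b in range(num_blocks)})
    has_pred = set()
    for p, s in sorted(order):
        succ[p].append(s)
        has_pred.add(s)
    for b in range(num_blocks):
        if b not in has_pred:
            succ[BEGIN].append(b)       # no predecessors
        if not succ[b] and b not in in_serial_do:
            succ[b].append(END)         # no successors
    return succ

# Blocks 0 and 1 are independent; both must precede block 2.
links = link_segment(3, {(0, 2), (1, 2)})
```

In this example blocks 0 and 1 both hang off BEGIN_SEGMENT and may start together, while block 2 alone leads to END_SEGMENT.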

PARAMETERS:

SIMUL - which DO loops are parallel - (*) BIT(1)


CONDITIONS:


NO_SIMULATE - turns FLAG.SIMULATE off


MESSAGES:


GENER #PARTS = ##, ALLOCN(PORD) = #

GENER ADD_SUC PRIMITIVE PRED.OC=##, PRED.SN=##.##
SUC.OC=##, SUC.SN=##.##

A  decimal  and  English  listing  of  each  code  element. 

GENER NO_SIMULATE RAISED FROM PROCEDURE "XXXXXXXXXXX",
WHILE  PROCESSING  PI  BLOCK  ## 

GENER  RETURNS 

IFCLUB, IFREMOV, IFTREE


FUNCTION: 

These   routines  direct   the  conditional   statement    removal 
process. 


METHOD:

These  routines  are  currently  in  a  state  of  flux  as 
various  implementations  of  the  theory  are  being  tried. 

INDEX


FUNCTION:


To remove all unnecessary parentheses from subscript
expressions by the application of distribution. To
transform the FORTRAN assignment statements so that there
is at most one operation on the right hand side.

METHOD:

This  module  can  be  thought  of  as  two  distinct 
transformations,  though  a  good  deal  of  common  code  is 
shared  by  the  two.   Each  transformation  will  be  explained 
separately. In any case, the program segment is
processed  one  statement  record  at  a  time. 

When  a  statement  is  encountered  that  could  contain  a 
subscripted  variable,  the  statement  is  parsed,  using  a 
simple  precedence  method,  until  a  beginning  of  subscript 
condition  (a  non  unary  left  parenthesis)  is  found.   At 
this  point,  the  parse  tree  building  routine  is  enabled, 
and  a  parse  of  the  subscript  is  made  until  an  end  of 
subscript  condition  (a  non  unary  right  parenthesis  or 
comma)  is  found.   The  parse  tree  is  then  distributed,  and 
a  corresponding  FORTRAN  like  subscript  entry  is  built  at 
the end of DETAIL. An indirect pointer is placed in the
position of the old subscript, and any remaining space is
nulled. If a comma caused the end of subscript
condition, the beginning of subscript condition is then
raised again, and the process repeats. If a right
parenthesis caused the end of subscript condition, then
the parse tree builder is disabled, and the parsing of
the current statement continues.
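
The distribution step described above can be illustrated with a
minimal sketch. It is not the compiler's PL/I code: the tuple-based
tree, the signed-term list, and the names expand and to_text are all
assumptions made for this illustration. A subscript tree is expanded
into a sum of signed products, which can be printed without any
parentheses.

```python
def expand(node):
    """Return a subscript tree as a list of (sign, factors) terms.

    A tree is either a leaf (an int constant or a variable name) or a
    tuple (op, left, right) with op in '+', '-', '*'.  This layout is
    hypothetical, chosen only for the illustration.
    """
    if not isinstance(node, tuple):
        return [(1, [node])]
    op, left, right = node
    l_terms, r_terms = expand(left), expand(right)
    if op == '+':
        return l_terms + r_terms
    if op == '-':
        return l_terms + [(-sign, factors) for sign, factors in r_terms]
    if op == '*':
        # Distribute: every term on the left times every term on the right.
        return [(ls * rs, lf + rf)
                for ls, lf in l_terms for rs, rf in r_terms]
    raise ValueError(f"unsupported operator {op!r}")


def to_text(terms):
    """Print the expanded terms as FORTRAN-like text with no parentheses."""
    out = ""
    for position, (sign, factors) in enumerate(terms):
        body = "*".join(str(f) for f in factors)
        if sign < 0:
            out += "-" + body
        else:
            out += body if position == 0 else "+" + body
    return out


# The subscript 4*(I-1) becomes 4*I-4*1.
print(to_text(expand(('*', 4, ('-', 'I', 1)))))
```

Keeping each term as a sign and a flat list of factors means that no
parenthesis is ever needed when the subscript is written back out,
which is the property the subscript normalization relies on.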

Every assignment statement is parsed, with the parse tree
builder initially enabled. Whenever a subscript list is
found, a pointer to it is saved, but it is otherwise
ignored. When the end of statement is reached, a special
routine converts this parse tree to a collection of FORTRAN
statements (placed at the end of DETAIL) in such a way
that every statement has at most one operation. This
necessitates the creation of temporaries (much like the
generation of three address code). The temporaries
created have the same number of dimensions as the current
DO loop nest level. Temporaries have a name of "T&##"
where the ## comes from the number of temporaries created
during this call to INDEX. One other simplification is
done at this stage. If all of the operands of one of
these generated statements would be constants, then the
name of the temporary is changed to "Z&##", and the
statement is not generated. If the FORTRAN program had a
scalar coded inside a DO loop as the result (left hand
side) variable of an assignment statement, the scalar
variable will be replaced by a compiler substituted array
with the same number of dimensions as the DO loop nest
level. The name of this substituted variable is "S&##".
Really, a multiple assignment is created, with the "S&##"
variable and the scalar both on the left hand side. If
the scalar variable is then used on the right hand side
of an assignment statement (or similar: CALL, WRITE,
etc.), then the scalar variable will be replaced by the
substituted variable. This transforms more things into
vector operations, and eliminates some of the common
forms of recurrences. Code should be generated when
exiting a DO loop to assign the scalar the value of the
correct element of the substituted array. This is not
done, as yet, though a place is reserved for doing it in
the INDEX routine.

An example of both transformations is perhaps best:

      DO 10 I = 1,15,1
   10 K = 1 + 2*B(4*(I-1),I+3) + 6*8


becomes:


      DO 10 I = 1,15,1
      T&00(I) = 2 * B(4*I-4*1,I+3)
      T&01(I) = 1 + T&00(I)
   10 S&03(I), K = T&01(I) + Z&02
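
The second transformation can be sketched in the same spirit. This is
a hedged illustration, not the INDEX routine itself: the tuple tree,
the function name triadize, and the choice to number T, Z, and S names
from one shared counter are assumptions (the shared counter is
inferred only from the numbering in the example above). The top-level
operation is assigned directly to the substituted array and the
original scalar.

```python
def triadize(target, tree, index="I"):
    """Flatten an assignment into statements with one operation each.

    target is the scalar on the left hand side; tree is a tuple
    (op, left, right) whose leaves are int constants or operand text.
    Assumes the top of the tree is an operation (hypothetical layout).
    """
    statements = []
    counter = 0

    def new_name(prefix):
        nonlocal counter
        name = f"{prefix}&{counter:02d}"
        counter += 1
        return name

    def is_constant(operand):
        # Literal constants and previously folded Z& temporaries.
        return isinstance(operand, int) or (
            isinstance(operand, str) and operand.startswith("Z&"))

    def walk(node):
        if not isinstance(node, tuple):
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        if is_constant(l) and is_constant(r):
            # All operands constant: rename to Z&##, emit no statement.
            return new_name("Z")
        temp = f"{new_name('T')}({index})"
        statements.append(f"{temp} = {l} {op} {r}")
        return temp

    op, left, right = tree
    l, r = walk(left), walk(right)
    substituted = f"{new_name('S')}({index})"
    statements.append(f"{substituted}, {target} = {l} {op} {r}")
    return statements


tree = ('+', ('+', 1, ('*', 2, 'B(4*I-4*1,I+3)')), ('*', 6, 8))
for line in triadize("K", tree):
    print(line)
```

Run on the tree for the example statement, the sketch reproduces the
three generated statements shown above, with 6*8 folded into Z&02 and
no statement emitted for it.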

PARAMETERS:

PROGRAM_PTR - program segment pointer - FIXED BINARY(15)
SUBSW - standardize subscript switch - BIT(1)
PARSESW - generate one operation code - BIT(1)

CONDITIONS:

TREE_SPACE - causes parse tree to grow
BAD_SUBSCRIPT - raises the ERROR CONDITION
BAD_OPERATION - raises the ERROR CONDITION
STACK_OVERFLOW - causes the parse stack to grow
V_STACK_OVERFLOW - causes the variable stack to grow
DETAIL_OVERFLOW - causes DETAIL to grow
V_STACK_UNDERFLOW - raises the ERROR CONDITION
STACK_UNDERFLOW - raises the ERROR CONDITION
UNMATCHED_PAREN - raises the ERROR CONDITION
UNMATCHED_SUBSCRIPT - raises the ERROR CONDITION

MESSAGES:

INDEX CALLED FOR PROGRAM_PTR = ###

INDEX STACK POP OF ###

INDEX STACK = ### ### ### ...

INDEX STACK PUSH OF ###

INDEX STACK OVERFLOW

INDEX STACK UNDERFLOW

INDEX V_STACK POP OF ###

INDEX V_STACK = ### ### ### ...

INDEX V_STACK PUSH OF ###

INDEX V_STACK OVERFLOW

INDEX V_STACK UNDERFLOW

INDEX PARSE TREE (##) (LEFT RIGHT DETAIL SYMBOL)
(##) (## ## ## ##) (##) (## ## ## ##) ...


INDEX ADD SYMBOL AT ##, NAME IS "XXXXXX", TYPE=##

INDEX SYMBOL TABLE OVERFLOW

INDEX EXPRESSION PARSE FROM ### TO ###

INDEX EXPRESSION PARSE RETURNS

INDEX BEGIN SUBSCRIPT AT DETAIL_PTR = ###

INDEX END SUBSCRIPT AT DETAIL_PTR = ###

INDEX VAR ADD FOR ### AT DETAIL_PTR = ###

INDEX VAR ADD FOR ### AT DETAIL_PTR = ###, AT TREE INDEX
###

INDEX OPERATOR ADD OF OPERATOR ##

INDEX OPERATOR ADD OF OPERATOR ##, AT TREE INDEX ###

INDEX TTD CALLED WITH TP=##, G1F=##, G2F=##, OPR=##

INDEX DETAIL PLACE OF ### AT DETAIL_PTR = ###

INDEX DETAIL OVERFLOW

INDEX TREE SPACE OVERFLOW

INDEX UNMATCHED LEFT PAREN

INDEX DISTRIBUTE CALLED FOR TREE_INDEX = ###

INDEX  TREE  TO  TRIAD  TEMPORARY  "XXXXXX"  SUBSTITUTED  FOR 
"XXXXXX" 

INDEX  TREE  TO  TRIAD,  NO  VARIABLE  OF  NAME  "XXXXXX"  IN  THE 
TEMPORARY  LIST 

INDEX  TREE  TO  TRIAD  CALLED 

INDEX  TOO  MANY  USER  TEMPORARIES 

INDEX  NULL  SUBSCRIPT 

INDEX  TOO  FEW  OPERANDS  FOR  OPERATOR 

INDEX USER VARIABLE "XXXXXX" CHANGED TO TEMPORARY
"XXXXXX"

INDEX DO# ## HAS A SUBSCRIPT LIST AT DETAIL_PTR = ###,
AND ENDS JUST PRIOR TO PROGRAM ENTRY ###

INDEX SUBSCRIPT NORMALIZATION OF DETAIL_PTR ###

INDEX TRIADIZATION OF DETAIL_PTR ###

INDEX RETURNS


IOLISTS


FUNCTION:


To build a list of input and output variables for each
statement record in a program segment, and to link all of
these lists together in one large list for the entire
segment.

METHOD:

First,  the  number  of  DO  loops  in  the  program  segment  is 
counted. This allows subscript information to be stored
with positional identification of the DO loop to which
the information applies. The program
segment  is  then  processed  a  statement  record  at  a  time. 
If  a  statement  can  change  the  value  of  a  variable,  then 
that  variable  is  added  to  the  output  list  for  this 
statement.   If  the  value  of  a  variable  can  be  used  by  a 
statement,  then  that  variable  is  added  to  the  input  list 
for  this  statement.   It  should  be  noted  that  DO  loops 
thus  have  an  output  list  of  the  index  variable,  and  an 
input  list  of  the  variables  used  in  the  bounding 
expressions.   SUBROUTINE,  READ,  and  FUNCTION  statements 
have  only  output  lists.   WRITE,  RETURN,  and  STOP 
statements  have  only  input  lists.   CALL  statements  have 
identical input and output lists. There is one case
where an input output list is built when there is not
necessarily a corresponding statement. When the end of a
DO loop is sensed, an input output list is built with no
input or output list (i.e., just a list head). Since
statement types are stored in the list head, it is easy
to differentiate between the various types of statements
for which input output lists are built (if need be).

Each atom in the input or output list is identical in
form  and  construction,  so  the  construction  will  only  be 
explained  once.   The  symbol  table  pointer  for  the 
variable  concerned  is  obtained  from  a  DETAIL  entry 
associated  with  the  current  statement  record.   This  entry 
is  looked  up  in  the  symbol  table,  and  the  number  of 
dimensions  is  extracted.   An  atom  is  then  allocated  for 
the  variable.   If  the  variable  is  dimensioned,  then  the 
subscripts  are  checked  for  linearity.   A  linear  subscript 
is  one  that  can  be  expressed  as: 

C0 + C1*X1 + C2*X2 + ...

where the Ci are integer values known at compile time,
and the Xi are DO loop index variables. If the
subscripts are linear, then the Ci are copied into a
matrix with the corresponding Xi being known by the
position of the Ci in the matrix. If the subscript is
not linear, then the expression must be calculated at run
time, and hence the variables involved in the subscript
must be added to the input list of the statement. The
linearity of the subscript is saved.
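
The linearity test can be sketched as follows. This is an illustration
under stated assumptions rather than the IOLISTS code: the subscript is
assumed to arrive already distributed as a list of (sign, factors)
terms, and the function name linear_coefficients and the row layout
[C0, C1, ..., Cn] are hypothetical.

```python
def linear_coefficients(terms, loop_indices):
    """Return [C0, C1, ..., Cn] for a linear subscript, else None.

    terms is a list of (sign, factors) pairs, each factor an int
    constant or a variable name.  loop_indices lists the DO loop index
    variables, outermost first; Ci belongs to loop_indices[i - 1].
    """
    row = [0] * (len(loop_indices) + 1)
    for sign, factors in terms:
        value = sign
        variables = []
        for factor in factors:
            if isinstance(factor, int):
                value *= factor
            else:
                variables.append(factor)
        if not variables:
            row[0] += value                       # constant term C0
        elif len(variables) == 1 and variables[0] in loop_indices:
            row[1 + loop_indices.index(variables[0])] += value
        else:
            # A product of variables, or a variable that is not a DO
            # loop index: the subscript must be computed at run time.
            return None
    return row


# 4*I - 4*1 inside a single loop on I gives C0 = -4, C1 = 4.
print(linear_coefficients([(1, [4, 'I']), (-1, [4, 1])], ['I']))
```

Because each coefficient is stored by position, the row itself records
which DO loop a coefficient belongs to, as the text describes for the
coefficient matrix.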


PARAMETERS:

PROGPTR - pointer to a program segment - FIXED BINARY(15)
IOHEAD - the head of the iolist to be built - POINTER
IOTAIL - the tail of the iolist to be built - POINTER
DH, DT - dummy - POINTER
ACTIVE - the number of DO loops - FIXED BINARY(15)
PRFLAG - the debug print flag - BIT(1)

CONDITIONS: 

AREA  -  causes  input  output  list  space  to  grow 

MESSAGES:

IOLISTS  CALLED,  PROGRAM  (###) 

A  listing  of  the  iolists  for  the  entire  program  segment. 

IOLISTS RETURNS


MASTER


FUNCTION:

To  control  the  transformation  process  from  the  tokenized 
version  of  the  FORTRAN  program  generated  by  the  SCANNER 
to  the  three  address  code  product.   To  title  and  number 
output  pages.   To  grow  the  correct  AREA  when  more  space 
is  needed. 

METHOD:

First, the options to be used for this particular run are
obtained by calling the routine OPTION. Then the scanned
program is obtained by calling the routine READIN. A
FORTRAN version of the tokenized program is printed out
now, and after every major transformation step, by calling
the routine UNSCAN. The routine CCARDS is called to
define any scalar integer variables that are known at
scan time. Then DO loops are normalized, and index
forward substitution is done by calling the routine
DONIFS. The DETAIL array is compacted by calling the
routine DETCMPT. The CCARDS routine is called again, to
allow the definition of DO loop upper bounds ("U&##"
variables), if desired. The routine INDEX is then called
to standardize the subscripts. The DETAIL array is
compacted, and SSDEL is called to eliminate unnecessary
parentheses associated with beginning and ending
subscripts.

If the forward substitution option was selected, then the
following steps are done. For every contiguous group of
assignment and input/output statements outside of all DO
loops, an input output list is built using the routine
IOLISTS. Then, a data dependency graph is built using the
routine DEPEND. The statement level forward substitution
is then done by calling the routine FORWARD. The now
useless input output list is released by calling the
routine IOLFRE. This process is repeated until no more
contiguous groups of statements (as described above)
exist. Then, the DETAIL array is compacted. This ends the
special code for the forward substitution option.

The FORTRAN program is then rewritten as three address
code FORTRAN by calling the routine INDEX. Excess
parentheses are again deleted by the routine SSDEL. If IF
statements are to be removed (and all must be), input
output lists are built for the entire program by the
routine IOLISTS, and the routine IFREMOV is called to
remove all of the IF statements that it can by using MODE
patterns, reordering computations, etc. The input output
lists are released when IFREMOV returns. In case there
are any IF statements left that IFREMOV was unable to
handle, the routine IFCLOB is called to segment the
program into IF free pieces. The routine COMPILE is then
called for each program segment to control the generation
of code for that segment.

If  the  AREA  condition  is  raised,  the  ON-BLOCK  in  MASTER 
decides  which  area  to  grow  by  looking  at  the  name  of  the 
procedure  that  caused  the  AREA  condition  to  be  raised. 

Titles and page numbering are handled by an
ENDPAGE(SYSPRINT) ON-BLOCK.


PARAMETERS:

ST - the subtitle - CHARACTER(65) VARYING

CONDITIONS:

AREA - causes the appropriate AREA to grow

ENDPAGE(SYSPRINT) - causes titles and page numbers

ERROR - causes the FINISH condition

FINISH - causes debug printing and termination

UNDEFINEDFILE(CODEOUT) - causes no code to be generated

UNDEFINEDFILE(SCAN) - causes no SCANNER like output

MESSAGES:

MASTER CALLED

MASTER BEGIN COMPILATION OF SEGMENT ###

MASTER AREA CONDITION RAISED

MASTER SYMBOL TABLE SIZE = ###, CODE_SIZE = ###

MASTER RETURNS


OPTION


FUNCTION: 

To  allow  the  user  to  change  the  default  values  of  the 
various  option  flags,  debug  flags,  and  array  sizes  for 
the  needs  of  the  particular  program  being  analyzed. 

METHOD:

A GET DATA is done on the file DEFAULT for three
structures: AREAINF, FLAG, and DEBUG. A GET DATA for the
same  three  structures  is  then  done  on  the  file  OPTIONS. 
This  allows  differing  sets  of  default  options  for 
different  users,  with  a  simple  method  of  overriding  the 
default  options. 
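
The two-pass override can be pictured with a small sketch. It only
imitates the idea and is not PL/I GET DATA: the simplified
"NAME=VALUE" item format, the function names, and the built-in values
are assumptions made for illustration, and all values are kept as
strings.

```python
def get_data(text, settings):
    """Apply 'NAME=VALUE' items from text to settings.

    Items are separated by blanks or commas and the stream ends with a
    semicolon (a simplification of data-directed input).  An item whose
    name is not a known setting is ignored, much as the NAME condition
    skips an unrecognized item.
    """
    for item in text.replace(";", " ").replace(",", " ").split():
        name, _, value = item.partition("=")
        if name in settings:
            settings[name] = value
    return settings


def load_options(default_text, options_text, builtin):
    """Read the DEFAULT stream first, then let OPTIONS override it."""
    settings = dict(builtin)
    get_data(default_text, settings)
    get_data(options_text, settings)
    return settings


# Hypothetical built-in values; only the member names come from the text.
builtin = {"FLAG.SIMULATE": "1", "DEBUG.OPTION": "0"}
print(load_options("FLAG.SIMULATE=0;", "DEBUG.OPTION=1, BOGUS=7;", builtin))
```

Reading the same structures twice is what makes the scheme cheap: the
second file needs to mention only the items that differ from the
defaults.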


PARAMETERS:


none


CONDITIONS:

UNDEFINEDFILE(OPTIONS) - turns DEBUG.OPTION on
UNDEFINEDFILE(DEFAULT) - turns DEBUG.OPTION on
ENDFILE(OPTIONS) - turns DEBUG.OPTION on
ENDFILE(DEFAULT) - turns DEBUG.OPTION on
NAME(OPTIONS) - ignores the item
NAME(DEFAULT) - ignores the item

MESSAGES:

OPTION  CALLED 

OPTION  END  OF  FILE  ON  OPTIONS 

OPTION NAME CONDITION RAISED ON "X". SKIPPING TO NEXT
NAME

OPTION OPTIONS FILE NOT DEFINED. DEFAULTS ASSUMED

A listing of all of the options that will be used.

OPTION RETURNS


PIPART


FUNCTION: 

To locate all cycles in a data dependency graph, and to
generate a new dependence graph with each of the cycles
corresponding to a single entry. To eliminate transitive
dependencies.

METHOD:

The transitive closure of the data dependency matrix is
computed. This new matrix contains an entry wherever there
is a path of any length between two nodes. Hence, if the
diagonally symmetric entries are both true for two
statements, then the two statements are in the same cycle.
The statements that are in the same cycle are collected in
a straightforward manner. Each of the cycles, or
independent statements, is called a PI BLOCK. The partial
ordering of the PI BLOCKS is obtained from the data
dependency matrix by ORing rows together that correspond
to the same PI BLOCK. This ordering is not minimal, as
many links exist that could be inferred by the transitive
law. At present, no attempt is made to remove these
extra links. The only cost of this approach is that more
space is needed for successor lists during the code
generation process.
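
A minimal sketch of the partitioning just described is given below. It
is not the PIPART routine: the function name, the list-of-lists
Boolean matrix, and the reading that dep[i][j] means "statement j
depends on statement i" are assumptions, as is the use of Warshall's
algorithm for the closure, since the text does not say how the closure
is computed.

```python
def pi_partition(dep):
    """Group statements into PI blocks and order the blocks.

    dep[i][j] is True when statement j depends on statement i.
    Returns (blocks, order): blocks is a list of statement lists, and
    order[a][b] is True when block b depends on block a.
    """
    n = len(dep)

    # Transitive closure by Warshall's algorithm.
    closure = [row[:] for row in dep]
    for k in range(n):
        for i in range(n):
            if closure[i][k]:
                for j in range(n):
                    if closure[k][j]:
                        closure[i][j] = True

    # Statements i and j share a cycle when both symmetric entries hold.
    block_of = [-1] * n
    blocks = []
    for i in range(n):
        if block_of[i] != -1:
            continue
        members = [i] + [j for j in range(i + 1, n)
                         if closure[i][j] and closure[j][i]]
        for m in members:
            block_of[m] = len(blocks)
        blocks.append(members)

    # OR together the rows (and columns) that fall in the same block.
    # Transitive links are deliberately left in, as in the text.
    p = len(blocks)
    order = [[False] * p for _ in range(p)]
    for i in range(n):
        for j in range(n):
            if dep[i][j] and block_of[i] != block_of[j]:
                order[block_of[i]][block_of[j]] = True
    return blocks, order


# Four statements: 0 -> 1, 1 -> 2, 2 -> 1 (a cycle), 2 -> 3.
dep = [[False, True, False, False],
       [False, False, True, False],
       [False, True, False, True],
       [False, False, False, False]]
print(pi_partition(dep))
```

In the example, statements 1 and 2 form one PI block because each can
reach the other, while statements 0 and 3 each form a block of their
own.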

PARAMETERS:

N - the number of statements - FIXED BINARY(15)

#PARTS - the number of PI BLOCKS found - FIXED BINARY(15)

PRFLAG - the debug print flag - BIT(1)


CONDITIONS: 


none 


MESSAGES: 


A partition of the statements into PI BLOCKS.
A partial ordering of the PI BLOCKS.


PRINTER


FUNCTION:

To provide a debug output of the symbol table (SYMTAB)
and the tokenized FORTRAN program (DETAIL and PROGRAM).

METHOD:

PROGRAM,  DETAIL,  and  SYMTAB  are  printed  out  in  decimal, 
using  column  labels  where  appropriate.   The  printing  of 
the symbol table is optional. The next record pointer
(offset +1) is used in conjunction with the statement
type (offset +0) to dissect the PROGRAM array into
statement records. The semicolon (token 99) is used to
break the DETAIL array into records. Each of the three
tables is printed until the value of PRGLAST, DETLAST, or
SYMLAST is reached.

PARAMETERS:

PF.SYM - print the symbol table flag - BIT(1)


CONDITIONS:


none


MESSAGES:

A listing of PROGRAM broken at statement records

A listing of the symbol table

A listing of DETAIL broken by semicolons


READIN


FUNCTION:

To read in the tokenized FORTRAN program.

METHOD:

A  file  with  a  name  of  SCANOUT  is  opened.   Two  control 
words  are  read  in  that  govern  the  size  of  PROGRAM,  and 
the  amount  of  space  already  used  in  PROGRAM  (PRGSIZE  and 
PRGLAST). PROGRAM is then allocated, and filled with
another read. DETAIL and SYMTAB are read in the same
way. Two dummy words are read in (the
first of which contains the title length, but it is
ignored), and then the title is read in. The initial
segment head (SEGMENT) is allocated, and initialized.


PARAMETERS: 


TITLE  -  the  title  of  this  program  -  CHARACTER  (65) 


CONDITIONS: 


UNDEFINEDFILE(INFILE) - raises the ERROR condition

ENDFILE(INFILE) - raises the ERROR condition or a blank
title


MESSAGES:

READIN CALLED

READIN ERROR - NO INPUT FROM SCANNER

READIN NOT ENOUGH DATA FROM SCANNER

READIN RETURNS, PRGSIZE=##, SYMSIZE=##, DETSIZE=##


SCANNER


FUNCTION:

To take a card image FORTRAN program and perform standard
lexical analysis functions. To remove all expressions
from the conditional portions of IF statements.

METHOD:

An input card is read from the file SYSIN, and the
statement number is extracted from it (if it exists).
Then the first token is extracted from the card. If this
token is one of the FORTRAN key words, a special routine
is called to handle that particular type of statement.
Otherwise, the statement must be an assignment statement.
Needless to say, this overlooks the two cases of possible
ambiguities in FORTRAN for this type of parse: a
variable named DO or a variable named FORMAT. Also,
statements such as INTEGER FUNCTION and REAL FUNCTION
are parsed as INTEGER and REAL statements, respectively,
because of the use of blank as a delimiter. While these
would all be serious flaws in a production FORTRAN
compiler, only a small amount of inconvenience is caused
in the environment in which the SCANNER will be used. The
resolution of statement labels is done after the entire
program is processed. A list of all of the statement
number definitions and references is kept until the END
card is processed. The expression that was coded inside
an IF statement is removed by the IF statement processor,
and it is assigned to a temporary variable "L&##". This
temporary variable is then placed inside the IF statement
in place of the expression that was removed.

A certain number of control cards are accepted by the
SCANNER. Each control card begins with a %, and then one
of several key words: DETAIL, PROGRAM, SYMBOL, or USER.
If one of these key words is recognized as the first token
on a control card, the next token on the card is assumed
to be a number giving the size of the respective table.
If the first token is not one of the control card key
words, then it is assumed to be the name of a user
defined builtin function. The second token on the
control card is assumed to be a number giving the
relative difficulty of calculating this function (from
one to four). The control cards for the sizes of the
various tables should be included prior to any other
cards in the input file SYSIN, if included at all.
Control cards defining user builtin functions should
appear before the first occurrence of that function. The
tokenized output is written on the file SCANOUT in the
format expected by READIN.


PARAMETERS:

TITLE  -  the  title  of  the  program  -  CHARACTER (65)  VARYING 

CONDITIONS:

ENDFILE(SYSIN) - sets end of data flag

MESSAGES:

A listing of the program to be scanned.

A listing of the generated PROGRAM array.

A listing of the generated symbol table.

A listing of the generated DETAIL array.


SEGMENT

FUNCTION:

To find all distinct execution paths in a program.

METHOD:

Travel  all  execution  paths,  keeping  track  of  what  places 
have  been  visited  before.   When  first  encountering  a 
statement  visited  before,  terminate  the  current  execution 
path,  and  break  the  old  one  that  was  encountered  at  the 
point  of  encounter.   Collect  the  beginning  pointer  for 
each  program  segment,  and  the  ending  pointer. 

PARAMETERS:
none

CONDITIONS:

none 

MESSAGES: 


none


SSDEL


FUNCTION:

To  delete  excess  parentheses  caused  by  having  DETCMPT 
follow  indirect  pointers  in  a  general  way.   To  change 
begin  subscript  tokens  to  left  parentheses,  and  end 
subscript  tokens  to  right  parentheses. 

METHOD: 

Every  entry  in  DETAIL  is  scanned  for  begin  and  end 
subscript  tokens,  and  these  are  translated  to  either 
parentheses  or  nulls,  depending  upon  the  surrounding 
conditions. Parentheses are deleted in the neighborhood
of a comma inside a subscript.

PARAMETERS:
none

CONDITIONS:
none

MESSAGES:
none


UNSCAN


FUNCTION: 

To  take  a  tokenized  version  of  a  FORTRAN  program,  and 
generate  a  readable  FORTRAN  program  similar  to  that  which 
"humans"  use. 

METHOD:

The  symbol  table  is  printed  first  as  a  series  of  REAL, 
INTEGER,  and  COMPLEX  statements.   Then  a  statement  record 
is  selected  from  PROGRAM,  and  a  special  section  of  code 
is executed, depending upon the statement type. The rest
is a rather straightforward matter of looking up
operators in tables and variables in the symbol
table.

PARAMETERS: 

PRGPTR  -  pointer  to  the  program  segment  -  FIXED 
BINARY(15) 

MESSAGE  -  a  comment  card  to  be  the  first  line  - 
CHARACTER (72) 

CONDITIONS: 
none 

MESSAGES:

A FORTRAN like listing is produced.


WRITEIT


FUNCTION:

To generate the output file of generated three address
code. To generate the output file of SCANNER like output
plus the partial ordering information necessary to
distribute DO loops.

METHOD:

The necessary information is copied into the HEADER, and
the HEADER is written to the file CODEOUT. Then, the
entire CODSPAC AREA is written to the file CODEOUT.
Then, the PROGRAM, DETAIL, TITLE, and SYMTAB are written
to the file SCAN just like READIN expects them. The
descriptive information for the PI PARTITION is then
written to the file SCAN, followed by the partition
information (PART and START).

PARAMETERS: 
none 

CONDITIONS:
none 

MESSAGES: 

WRITEIT SEGMENT # ##, DEPEND* ##


APPENDIX C


Data Structures


SCANNER Output

The SCANNER data structures are explained in [S2].


Input  Output  Lists 

The  input  output  lists  are  actually  a  list  when  looked  at 
from  a  statement  level.   Each  statement  has  an  entry 
(IOLIST) ,  and  these  entries  are  linked  together  in  card 
reader  linear  order.   For  each  statement,  there  are  two 
possible  lists.   The  first  list  contains  entries  for  all 
variables  whose  values  are  required  for  the  evaluation  of 
the  statement,  and  the  second  list  contains  entries  for 
all  variables  whose  values  are  set  by  the  statement.   The 
variables  in  each  of  these  lists  are  represented  by 
ATOMS.   The  IOLIST  entry  contains  information  about  which 
of  the  index  sets  apply  to  this  statement,  and  offsets 
into the SCANNER data structures for this statement. The
ATOMS contain information about the subscripts associated
with this variable, both in the original tokenized form,
and in a compact coefficient form.


Data Dependency

A  square  bit  matrix  with  a  dimension  equal  to  the  number 
of  statements  is  the  result  of  the  calculation  of  the 
near  minimal  valid  partial  ordering. 


PI  Partitioning 

A square matrix with a dimension equal to the number of
PI  blocks  represents  the  partial  ordering  of  PI  blocks. 
Two  linear  arrays  give  the  correspondence  between 
statements  and  PI  blocks.  The  first  gives  the  starting 
location  of  the  statement  number  list  for  this  PI  block 
in  the  second.  The  second  is  a  list  of  statement 
numbers,  ordered  by  PI  block. 
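
The two linear arrays can be illustrated with a short sketch. The
array names START and PART come from the WRITEIT description, but
which name goes with which array, the zero-based indexing, and the
helper names below are assumptions made for the illustration.

```python
def pack_blocks(blocks):
    """Build the two linear arrays from a list of PI blocks.

    part is the list of statement numbers ordered by PI block, and
    start[b] is the position in part where block b's statements begin.
    """
    start, part = [], []
    for members in blocks:
        start.append(len(part))
        part.extend(members)
    return start, part


def statements_in(block, start, part):
    """Return the statement numbers that belong to one PI block."""
    if block + 1 < len(start):
        end = start[block + 1]
    else:
        end = len(part)            # the last block runs to the end
    return part[start[block]:end]


start, part = pack_blocks([[0], [1, 2], [3]])
print(start, part, statements_in(1, start, part))
```

Storing only the starting positions keeps both arrays linear in the
number of statements, with each block's extent implied by the start of
the next block.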


Compiler  Output 

A connected acyclic directed graph with one source and
one sink is the output of the compiler. The vertices are
composed of operations (CODE elements), and the edges by
SUCCESSOR LISTS.


Assorted Structures

DEBUG - A bit structure which controls output volume

FLAG - Compiler options

TYPEINF - A mnemonic association for scanned tokens